Feature engineering for electricity load forecasting#
The purpose of this notebook is to demonstrate how to use skrub and
polars to perform feature engineering for electricity load forecasting.
We will build a set of features (and targets) from different data sources:
Historical weather data for 10 medium to large urban areas in France;
Holidays and standard calendar features for France;
Historical electricity load data for the whole of France.
All these data sources cover a time range from March 23, 2021 to May 31, 2025.
Since our maximum forecasting horizon is 24 hours, we treat the future weather data as known at the chosen prediction time. Similarly, the holidays and calendar features are known at prediction time for any point in the future.
Therefore, exogenous features derived from the weather and calendar data can be used to engineer “future covariates”. Since the load data is our prediction target, we can also use it to engineer “past covariates” such as lagged features and rolling aggregations. The future values of the load data (with respect to the prediction time) are used as targets for the forecasting model.
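To make the distinction concrete, here is a minimal plain-Python sketch that assembles one training row for a single prediction time. The toy values, the `make_row` helper, and the chosen lags and horizon are illustrative assumptions and are not part of the notebook's pipeline.

```python
# Toy hourly series: index t means "t hours after the start of the record".
load = [50.0, 52.0, 55.0, 53.0, 51.0, 49.0]    # prediction target
temperature = [8.0, 7.5, 7.0, 7.2, 8.1, 9.0]   # exogenous, assumed known ahead


def make_row(t, lags=(1, 2), horizon=2):
    """Assemble the features and the target for prediction time index t."""
    row = {}
    # Past covariates: load values observed strictly before the prediction time.
    for k in lags:
        row[f"load_lag_{k}h"] = load[t - k] if t - k >= 0 else None
    # Future covariate: exogenous value at the forecast horizon.
    row[f"temperature_future_{horizon}h"] = temperature[t + horizon]
    # Target: future value of the load itself.
    row[f"load_target_{horizon}h"] = load[t + horizon]
    return row


row = make_row(t=2)
print(row)
```

Only the target and the future covariate look ahead of the prediction time; everything derived from the load itself looks backwards.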
Environment setup#
If needed (for instance when running in JupyterLite), we first install some extra dependencies for this notebook. We need the development version of skrub to be able to use the skrub expressions.
%pip install -q https://pypi.anaconda.org/ogrisel/simple/polars/1.24.0/polars-1.24.0-cp39-abi3-emscripten_3_1_58_wasm32.whl
%pip install -q https://pypi.anaconda.org/ogrisel/simple/skrub/0.6.dev0/skrub-0.6.dev0-py3-none-any.whl
%pip install -q altair holidays plotly nbformat
The following 3 imports are only needed to work around some limitations when using polars in a pyodide/jupyterlite notebook.
TODO: remove those workarounds once pyodide 0.28 is released with support for the latest polars version.
import tzdata # noqa: F401
import pandas as pd
from pyarrow.parquet import read_table
import altair
import numpy as np
import polars as pl
import skrub
from pathlib import Path
import holidays
import warnings
# Ignore warnings from pkg_resources triggered by Python 3.13's multiprocessing.
warnings.filterwarnings("ignore", category=UserWarning, module="pkg_resources")
Calendar and holidays features#
We leverage the holidays package to enrich the time range with some
calendar features such as public holidays in France. We also add some
features that are useful for time series forecasting such as the day of the
week, the day of the year, and the hour of the day.
Note that the holidays package requires us to extract the date for the
French timezone.
Similarly for the calendar features: all the time features are extracted from the time in the French timezone, since it is likely that electricity usage patterns are influenced by inhabitants’ daily routines aligned with the local timezone.
@skrub.deferred
def prepare_french_calendar_data(time):
    # All calendar features are derived from the Paris wall-clock time.
    fr_time = pl.col("time").dt.convert_time_zone("Europe/Paris")
    fr_year_min = time.select(fr_time.dt.year().min()).item()
    fr_year_max = time.select(fr_time.dt.year().max()).item()
    # French public holidays for every year covered by the time range.
    holidays_fr = holidays.country_holidays(
        "FR", years=range(fr_year_min, fr_year_max + 1)
    )
    return time.with_columns(
        [
            fr_time.dt.hour().alias("cal_hour_of_day"),
            fr_time.dt.weekday().alias("cal_day_of_week"),
            fr_time.dt.ordinal_day().alias("cal_day_of_year"),
            fr_time.dt.year().alias("cal_year"),
            fr_time.dt.date().is_in(holidays_fr.keys()).alias("cal_is_holiday"),
        ],
    )
calendar = prepare_french_calendar_data(time)
calendar
| time | cal_hour_of_day | cal_day_of_week | cal_day_of_year | cal_year | cal_is_holiday |
|---|---|---|---|---|---|
| 2021-03-23 00:00:00+00:00 | 1 | 2 | 82 | 2021 | False |
| 2021-03-23 01:00:00+00:00 | 2 | 2 | 82 | 2021 | False |
| 2021-03-23 02:00:00+00:00 | 3 | 2 | 82 | 2021 | False |
| 2021-03-23 03:00:00+00:00 | 4 | 2 | 82 | 2021 | False |
| 2021-03-23 04:00:00+00:00 | 5 | 2 | 82 | 2021 | False |
| 2025-05-31 19:00:00+00:00 | 21 | 6 | 151 | 2025 | False |
| 2025-05-31 20:00:00+00:00 | 22 | 6 | 151 | 2025 | False |
| 2025-05-31 21:00:00+00:00 | 23 | 6 | 151 | 2025 | False |
| 2025-05-31 22:00:00+00:00 | 0 | 7 | 152 | 2025 | False |
| 2025-05-31 23:00:00+00:00 | 1 | 7 | 152 | 2025 | False |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | cal_hour_of_day | Int8 | 0 (0.0%) | 24 (< 0.1%) | 11.5 | 6.92 | 0.00 | 12.0 | 23.0 |
| 2 | cal_day_of_week | Int8 | 0 (0.0%) | 7 (< 0.1%) | 4.00 | 2.00 | 1.00 | 4.00 | 7.00 |
| 3 | cal_day_of_year | Int16 | 0 (0.0%) | 366 (1.0%) | 180. | 104. | 1.00 | 174. | 366. |
| 4 | cal_year | Int32 | 0 (0.0%) | 5 (< 0.1%) | 2.02e+03 | 1.26 | 2.02e+03 | 2.02e+03 | 2.02e+03 |
| 5 | cal_is_holiday | Boolean | 0 (0.0%) | 2 (< 0.1%) | | | | | |
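The effect of extracting calendar features in local time can be checked with a small standard-library sketch. It uses `zoneinfo` instead of polars, and the `calendar_features` helper is a hypothetical name introduced only for this illustration; the two timestamps are taken from the first and last rows of the preview table above.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def calendar_features(utc_time):
    """Calendar features computed on the Paris wall-clock time."""
    local = utc_time.astimezone(PARIS)
    return {
        "hour_of_day": local.hour,
        "day_of_week": local.isoweekday(),  # Monday=1 ... Sunday=7
        "day_of_year": local.timetuple().tm_yday,
    }


# Winter time (UTC+1): midnight UTC is 01:00 in Paris.
winter = calendar_features(datetime(2021, 3, 23, 0, tzinfo=timezone.utc))
# Summer time (UTC+2): 22:00 UTC is already the next calendar day in Paris.
summer = calendar_features(datetime(2025, 5, 31, 22, tzinfo=timezone.utc))
print(winter)
print(summer)
```

The offset between UTC and Paris time changes with daylight saving time, so a fixed shift of the UTC hour would misalign the daily pattern for part of the year.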
Electricity load data#
Finally, we load the electricity load data. This data will be used both as the target variable and to craft some lagged and window-aggregated features.
@skrub.deferred
def load_electricity_load_data(time, data_source_folder):
    """Load and aggregate historical load data from the raw CSV files."""
    load_data_files = [
        data_file
        for data_file in sorted(data_source_folder.iterdir())
        if data_file.name.startswith("Total Load - Day Ahead")
        and data_file.name.endswith(".csv")
    ]
    return time.join(
        (
            pl.concat(
                [
                    pl.from_pandas(pd.read_csv(data_file, na_values=["N/A", "-"])).drop(
                        ["Day-ahead Total Load Forecast [MW] - BZN|FR"]
                    )
                    for data_file in load_data_files
                ]
            ).select(
                [
                    pl.col("Time (UTC)")
                    .str.split(by=" - ")
                    .list.first()
                    .str.to_datetime("%d.%m.%Y %H:%M", time_zone="UTC")
                    .alias("time"),
                    pl.col("Actual Total Load [MW] - BZN|FR").alias("load_mw"),
                ]
            )
        ),
        on="time",
    )
electricity = load_electricity_load_data(time, data_source_folder)
electricity
| time | load_mw |
|---|---|
| 2021-03-23 00:00:00+00:00 | 59823.0 |
| 2021-03-23 01:00:00+00:00 | 59369.0 |
| 2021-03-23 02:00:00+00:00 | 57550.0 |
| 2021-03-23 03:00:00+00:00 | 57188.0 |
| 2021-03-23 04:00:00+00:00 | 60367.0 |
| 2025-05-31 19:00:00+00:00 | 39069.0 |
| 2025-05-31 20:00:00+00:00 | 40387.0 |
| 2025-05-31 21:00:00+00:00 | 41174.0 |
| 2025-05-31 22:00:00+00:00 | 39664.0 |
| 2025-05-31 23:00:00+00:00 | 36067.0 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | load_mw | Float64 | 36 (< 0.1%) | 23318 (63.5%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
electricity.filter(pl.col("load_mw").is_null())
| time | load_mw |
|---|---|
| 2021-05-12 08:00:00+00:00 | |
| 2021-05-19 04:00:00+00:00 | |
| 2021-06-03 16:00:00+00:00 | |
| 2021-10-31 00:00:00+00:00 | |
| 2021-10-31 01:00:00+00:00 | |
| 2023-03-26 00:00:00+00:00 | |
| 2023-04-17 12:00:00+00:00 | |
| 2023-04-17 13:00:00+00:00 | |
| 2024-12-31 23:00:00+00:00 | |
| 2025-03-30 02:00:00+00:00 | |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36 (100.0%) | | | 2021-05-12T08:00:00+00:00 | | 2025-03-30T02:00:00+00:00 |
| 1 | load_mw | Float64 | 36 (100.0%) | | | | | | |
electricity.filter(
(pl.col("time") > pl.datetime(2021, 10, 30, hour=10, time_zone="UTC"))
& (pl.col("time") < pl.datetime(2021, 10, 31, hour=10, time_zone="UTC"))
).skb.eval().plot.line(x="time:T", y="load_mw:Q")
electricity = electricity.with_columns([pl.col("load_mw").interpolate()])
electricity.filter(
(pl.col("time") > pl.datetime(2021, 10, 30, hour=10, time_zone="UTC"))
& (pl.col("time") < pl.datetime(2021, 10, 31, hour=10, time_zone="UTC"))
).skb.eval().plot.line(x="time:T", y="load_mw:Q")
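The gap filling above relies on the polars `interpolate()` expression, which performs linear interpolation by default. As a rough illustration of what that means for an hourly series, here is a plain-Python sketch; the `interpolate_linear` helper and the toy values are assumptions made for this example only.

```python
def interpolate_linear(values):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = values[right] - values[left]
        for i in range(left + 1, right):
            # Move proportionally to the position inside the gap.
            filled[i] = values[left] + step * (i - left) / (right - left)
    return filled


hourly_load = [100.0, None, None, 130.0, 120.0]
filled_load = interpolate_linear(hourly_load)
print(filled_load)
```

In this sketch, missing values at the very start or end of the series are left untouched because they have no neighbour on one side.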
Lagged features#
We can now create some lagged features from the electricity load data.
We will create 3 hourly lagged features, 1 daily lagged feature, and 1 weekly lagged feature. We will also create a rolling median and a rolling inter-quartile range (IQR) feature over the last 24 hours and over the last 7 days.
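As a rough mental model of these two operations, the following plain-Python sketch mimics a lag and a trailing rolling median on a toy series. The `shift` and `rolling_median` helpers are simplified stand-ins written for this illustration; they assume a window that includes the current row and that yields a missing value until the window is full.

```python
from statistics import median


def shift(values, n):
    """Lag a series by n steps; the first n positions have no history."""
    return [None] * n + values[: len(values) - n]


def rolling_median(values, window_size):
    """Trailing median over the current value and the window_size - 1 before it."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(None)  # not enough history yet
        else:
            out.append(median(values[i + 1 - window_size : i + 1]))
    return out


load = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
lag_2h = shift(load, 2)
median_3h = rolling_median(load, 3)
print(lag_2h)
print(median_3h)
```

Under these assumptions a lag of `n` steps leaves `n` leading missing values and a window of size `w` leaves `w - 1`, which is a quick sanity check to run against the null counts of any lagged dataset.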
def iqr(col, *, window_size: int):
    """Inter-quartile range (IQR) of a column."""
    return col.rolling_quantile(0.75, window_size=window_size) - col.rolling_quantile(
        0.25, window_size=window_size
    )
electricity_lagged = electricity.with_columns(
    [pl.col("load_mw").shift(i).alias(f"load_mw_lag_{i}h") for i in range(1, 4)]
    + [
        pl.col("load_mw").shift(24).alias("load_mw_lag_1d"),
        pl.col("load_mw").shift(24 * 7).alias("load_mw_lag_1w"),
        pl.col("load_mw")
        .rolling_median(window_size=24)
        .alias("load_mw_rolling_median_24h"),
        pl.col("load_mw")
        .rolling_median(window_size=24 * 7)
        .alias("load_mw_rolling_median_7d"),
        iqr(pl.col("load_mw"), window_size=24).alias("load_mw_iqr_24h"),
        iqr(pl.col("load_mw"), window_size=24 * 7).alias("load_mw_iqr_7d"),
    ],
)
electricity_lagged
| time | load_mw | load_mw_lag_1h | load_mw_lag_2h | load_mw_lag_3h | load_mw_lag_1d | load_mw_lag_1w | load_mw_rolling_median_24h | load_mw_rolling_median_7d | load_mw_iqr_24h | load_mw_iqr_7d |
|---|---|---|---|---|---|---|---|---|---|---|
| 2021-03-23 00:00:00+00:00 | 59823.0 | | | | | | | | | |
| 2021-03-23 01:00:00+00:00 | 59369.0 | 59823.0 | | | | | | | | |
| 2021-03-23 02:00:00+00:00 | 57550.0 | 59369.0 | 59823.0 | | | | | | | |
| 2021-03-23 03:00:00+00:00 | 57188.0 | 57550.0 | 59369.0 | 59823.0 | | | | | | |
| 2021-03-23 04:00:00+00:00 | 60367.0 | 57188.0 | 57550.0 | 59369.0 | | | | | | |
| 2025-05-31 19:00:00+00:00 | 39069.0 | 39980.0 | 40890.0 | 40175.0 | 41584.0 | 39144.0 | 39356.0 | 40659.0 | 4231.0 | 7238.0 |
| 2025-05-31 20:00:00+00:00 | 40387.0 | 39069.0 | 39980.0 | 40890.0 | 42931.0 | 40286.0 | 39356.0 | 40659.0 | 4159.0 | 7238.0 |
| 2025-05-31 21:00:00+00:00 | 41174.0 | 40387.0 | 39069.0 | 39980.0 | 43812.0 | 41468.0 | 39356.0 | 40659.0 | 4159.0 | 7238.0 |
| 2025-05-31 22:00:00+00:00 | 39664.0 | 41174.0 | 40387.0 | 39069.0 | 41966.0 | 40346.0 | 39356.0 | 40659.0 | 4140.0 | 7238.0 |
| 2025-05-31 23:00:00+00:00 | 36067.0 | 39664.0 | 41174.0 | 40387.0 | 38248.0 | 37076.0 | 39356.0 | 40659.0 | 4823.0 | 7239.0 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | time | Datetime | 0 (0.0%) | 36744 (100.0%) | | | 2021-03-23T00:00:00+00:00 | | 2025-05-31T23:00:00+00:00 |
| 1 | load_mw | Float64 | 0 (0.0%) | 23353 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 2 | load_mw_lag_1h | Float64 | 1 (< 0.1%) | 23353 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 3 | load_mw_lag_2h | Float64 | 2 (< 0.1%) | 23352 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 4 | load_mw_lag_3h | Float64 | 3 (< 0.1%) | 23352 (63.6%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 5 | load_mw_lag_1d | Float64 | 24 (< 0.1%) | 23342 (63.5%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 6 | load_mw_lag_1w | Float64 | 168 (0.5%) | 23293 (63.4%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.82e+04 | 8.66e+04 |
| 7 | load_mw_rolling_median_24h | Float64 | 23 (< 0.1%) | 9644 (26.2%) | 5.06e+04 | 9.28e+03 | 3.37e+04 | 4.75e+04 | 7.84e+04 |
| 8 | load_mw_rolling_median_7d | Float64 | 167 (0.5%) | 7138 (19.4%) | 5.01e+04 | 8.82e+03 | 3.85e+04 | 4.60e+04 | 7.39e+04 |
| 9 | load_mw_iqr_24h | Float64 | 23 (< 0.1%) | 5922 (16.1%) | 6.52e+03 | 1.56e+03 | 2.32e+03 | 6.43e+03 | 1.60e+04 |
| 10 | load_mw_iqr_7d | Float64 | 167 (0.5%) | 5327 (14.5%) | 8.30e+03 | 1.41e+03 | 5.04e+03 | 8.27e+03 | 1.86e+04 |
altair.Chart(electricity_lagged.tail(100).skb.eval()).transform_fold(
    [
        "load_mw",
        "load_mw_lag_1h",
        "load_mw_lag_2h",
        "load_mw_lag_3h",
        "load_mw_lag_1d",
        "load_mw_lag_1w",
        "load_mw_rolling_median_24h",
        "load_mw_rolling_median_7d",
        "load_mw_iqr_24h",
        "load_mw_iqr_7d",
    ],
    as_=["key", "load_mw"],
).mark_line(tooltip=True).encode(x="time:T", y="load_mw:Q", color="key:N").interactive()
Remark: lagged feature engineering and system lag#
When working with historical data, we often have access to all the past measurements in the dataset. However, when we want to use the lagged features in a forecasting model, we need to be careful about the length of the system lag: the delay between the moment a timestamped measurement is made in the real world and the moment the record becomes available to the downstream application (in our case, a deployed predictive pipeline).
System lag is rarely explicitly represented in the data sources, even though such a delay can be as large as several hours or even days and can sometimes be irregular. For instance, if there is a human intervention in the data recording process, holidays and weekends can occasionally add a significant delay.
If the system lag is larger than the maximum feature engineering lag, the resulting features will be filled with missing values once deployed. More importantly, if the system lag is not handled explicitly, those missing values will only be present in the features computed for the deployed system and not in the features computed to train and backtest the system before deployment.
This structural discrepancy can severely degrade the performance of the deployed model compared to the performance estimated from backtesting on the historical data.
We will set this problem aside for now but discuss it again in a later section of this tutorial.
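One possible way to reason about this, sketched here in plain Python under simplifying assumptions, is to emulate the publication delay when building lagged features for backtesting. The `lag_feature` helper and its `system_lag` parameter are hypothetical names for this illustration, and the sketch assumes a constant delay expressed in hourly steps.

```python
def lag_feature(values, lag, system_lag=0):
    """Lagged copy of `values`, emulating a publication delay of `system_lag` steps.

    A measurement made `lag` steps before the prediction time is only usable if
    it has been published by then, that is if lag >= system_lag.
    """
    if lag < system_lag:
        # The record would not have arrived yet in production: always missing.
        return [None] * len(values)
    return [None] * lag + values[: len(values) - lag]


load = [50.0, 52.0, 55.0, 53.0, 51.0]
# With a 3-hour system lag, the 1-hour lag can never be computed in production...
lag_1h = lag_feature(load, lag=1, system_lag=3)
# ...while the 3-hour lag is unaffected.
lag_3h = lag_feature(load, lag=3, system_lag=3)
print(lag_1h)
print(lag_3h)
```

Applying the same masking at training and backtesting time keeps the features statistically consistent with what the deployed pipeline would actually see.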
Investigating outliers in the lagged features#
Let’s use the skrub.TableReport tool to look at the plots of the marginal
distribution of the lagged features.
from skrub import TableReport
TableReport(electricity_lagged.skb.eval())
| Column 1 | Column 2 | Cramér's V | Pearson's Correlation |
|---|---|---|---|
| load_mw_lag_1h | load_mw_lag_2h | 0.723 | 0.983 |
| load_mw_lag_2h | load_mw_lag_3h | 0.723 | 0.982 |
| load_mw | load_mw_lag_1h | 0.697 | 0.982 |
| load_mw_lag_1d | load_mw_rolling_median_24h | 0.679 | 0.889 |
| load_mw_lag_1w | load_mw_rolling_median_7d | 0.610 | 0.834 |
| load_mw | load_mw_lag_1d | 0.594 | 0.929 |
| load_mw_rolling_median_24h | load_mw_rolling_median_7d | 0.558 | 0.917 |
| load_mw_lag_1h | load_mw_lag_3h | 0.556 | 0.944 |
| load_mw | load_mw_lag_2h | 0.553 | 0.944 |
| load_mw_lag_1h | load_mw_lag_1d | 0.550 | 0.917 |
| load_mw_lag_3h | load_mw_rolling_median_24h | 0.528 | 0.896 |
| load_mw_lag_1h | load_mw_rolling_median_24h | 0.527 | 0.886 |
| load_mw_lag_2h | load_mw_rolling_median_24h | 0.526 | 0.890 |
| load_mw | load_mw_rolling_median_24h | 0.521 | 0.883 |
| load_mw_lag_2h | load_mw_lag_1d | 0.491 | 0.886 |
| load_mw | load_mw_lag_1w | 0.491 | 0.873 |
| load_mw_lag_1d | load_mw_lag_1w | 0.491 | 0.845 |
| load_mw_rolling_median_7d | load_mw_iqr_7d | 0.488 | 0.218 |
| load_mw_lag_1d | load_mw_rolling_median_7d | 0.486 | 0.853 |
| load_mw_rolling_median_24h | load_mw_iqr_24h | 0.473 | 0.255 |
Let’s extract the dates where the 7-day rolling inter-quartile range of the load is greater than 15,000 MW.
electricity_lagged.filter(pl.col("load_mw_iqr_7d") > 15_000)[
"time"
].dt.date().unique().sort().to_list().skb.eval()
[datetime.date(2021, 12, 26),
datetime.date(2021, 12, 27),
datetime.date(2021, 12, 28),
datetime.date(2022, 1, 7),
datetime.date(2022, 1, 8),
datetime.date(2023, 1, 19),
datetime.date(2023, 1, 20),
datetime.date(2023, 1, 21),
datetime.date(2024, 1, 10),
datetime.date(2024, 1, 11),
datetime.date(2024, 1, 12),
datetime.date(2024, 1, 13)]
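To count the distinct episodes in this list, the dates can be grouped into runs of consecutive days. The `group_consecutive` helper below is a hypothetical plain-Python utility written for this illustration; the dates are copied from the output above.

```python
from datetime import date, timedelta

high_iqr_dates = [
    date(2021, 12, 26), date(2021, 12, 27), date(2021, 12, 28),
    date(2022, 1, 7), date(2022, 1, 8),
    date(2023, 1, 19), date(2023, 1, 20), date(2023, 1, 21),
    date(2024, 1, 10), date(2024, 1, 11), date(2024, 1, 12), date(2024, 1, 13),
]


def group_consecutive(dates):
    """Group sorted dates into (first, last) runs of consecutive days."""
    runs = []
    for d in dates:
        if runs and d - runs[-1][1] == timedelta(days=1):
            runs[-1] = (runs[-1][0], d)  # extend the current run
        else:
            runs.append((d, d))  # start a new run
    return runs


date_ranges = group_consecutive(high_iqr_dates)
for first, last in date_ranges:
    print(first, "to", last)
```

On this list the helper returns four runs, one per winter episode between late December 2021 and January 2024.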
We observe 4 date ranges with a high inter-quartile range. Let’s plot the electricity load and its 7-day rolling inter-quartile range for the first date range, along with the temperature data for the cities in our weather dataset.
altair.Chart(
electricity_lagged.filter(
(pl.col("time") > pl.datetime(2021, 12, 1, time_zone="UTC"))
& (pl.col("time") < pl.datetime(2021, 12, 31, time_zone="UTC"))
).skb.eval()
).transform_fold(
[
"load_mw",
"load_mw_iqr_7d",
],
).mark_line(
tooltip=True
).encode(
x="time:T", y="value:Q", color="key:N"
).interactive()
altair.Chart(
all_city_weather.filter(
(pl.col("time") > pl.datetime(2021, 12, 1, time_zone="UTC"))
& (pl.col("time") < pl.datetime(2021, 12, 31, time_zone="UTC"))
).skb.eval()
).transform_fold(
[f"weather_temperature_2m_{city_name}" for city_name in city_names.skb.eval()],
).mark_line(
tooltip=True
).encode(
x="time:T", y="value:Q", color="key:N"
).interactive()
Based on the plots above, we can see that the electricity load was high just before the Christmas holidays due to low temperatures. Then the load suddenly dropped because temperatures rose right at the start of the end-of-year holidays.
So those outliers do not seem to be caused by a data quality issue but rather by a real change in electricity demand. We could conduct a similar analysis for the other date ranges with a high inter-quartile range, but we will skip that for now.
If we had observed significant data quality issues over extended periods of
time, we could have addressed them by removing the corresponding rows from the
dataset. However, this would make the lagged and windowing feature
engineering challenging to reimplement correctly. A better approach would be
to keep a contiguous dataset and assign zero weights to the affected rows when
fitting or evaluating the trained models via the sample_weight
parameter.
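A minimal sketch of how such weights could be built is shown below, in plain Python. The `build_sample_weight` helper, the toy timestamps and the outage range are all hypothetical; the resulting list would typically be passed to an estimator that accepts a `sample_weight` argument in its `fit` method.

```python
from datetime import datetime, timedelta, timezone


def build_sample_weight(times, bad_ranges):
    """Return 1.0 for trusted rows and 0.0 for rows inside any (start, end) range."""
    return [
        0.0 if any(start <= t <= end for start, end in bad_ranges) else 1.0
        for t in times
    ]


start = datetime(2021, 3, 23, tzinfo=timezone.utc)
times = [start + timedelta(hours=h) for h in range(6)]
# Hypothetical data quality issue covering the third and fourth hourly records.
bad_ranges = [(times[2], times[3])]
sample_weight = build_sample_weight(times, bad_ranges)
print(sample_weight)
```

Because the rows stay in place, lagged and rolling features can still be computed on a contiguous hourly grid. One may also want to widen each range by the longest window used in feature engineering, since features computed shortly after a faulty period still depend on it.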
Final dataset#
We now assemble the dataset that will be used to train and evaluate the forecasting models via backtesting.
prediction_start_time = skrub.var(
"prediction_start_time", historical_data_start_time.skb.eval() + pl.duration(days=7)
)
prediction_end_time = skrub.var(
"prediction_end_time", historical_data_end_time.skb.eval() - pl.duration(hours=24)
)
@skrub.deferred
def define_prediction_time_range(prediction_start_time, prediction_end_time):
    return pl.DataFrame().with_columns(
        pl.datetime_range(
            start=prediction_start_time,
            end=prediction_end_time,
            time_zone="UTC",
            interval="1h",
        ).alias("prediction_time"),
    )
prediction_time = define_prediction_time_range(
prediction_start_time, prediction_end_time
)
prediction_time
| prediction_time |
|---|
| 2021-03-30 00:00:00+00:00 |
| 2021-03-30 01:00:00+00:00 |
| 2021-03-30 02:00:00+00:00 |
| 2021-03-30 03:00:00+00:00 |
| 2021-03-30 04:00:00+00:00 |
| 2025-05-30 19:00:00+00:00 |
| 2025-05-30 20:00:00+00:00 |
| 2025-05-30 21:00:00+00:00 |
| 2025-05-30 22:00:00+00:00 |
| 2025-05-30 23:00:00+00:00 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | prediction_time | Datetime | 0 (0.0%) | 36552 (100.0%) | | | 2021-03-30T00:00:00+00:00 | | 2025-05-30T23:00:00+00:00 |
No columns match the selected filter: . You can change the column filter in the dropdown menu above.
Please enable javascript
The skrub table reports need javascript to display correctly. If you are displaying a report in a Jupyter notebook and you see this message, you may need to re-execute the cell or to trust the notebook (button on the top right or "File > Trust notebook").
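The `build_features` function below turns the weather and calendar columns into future covariates with `shift(-h)`: on each row, the shifted column holds the value found `h` rows later, which for hourly data means `h` hours later. The following plain-Python sketch mimics that behavior on a hypothetical five-hour temperature series. The `shift` helper is an illustrative stand-in, not the polars implementation.

```python
def shift(values, n):
    """Shift values by n rows; a negative n pulls future values to earlier rows.

    Positions left without a value are filled with None.
    """
    values = list(values)
    if n == 0:
        return values
    padding = [None] * min(abs(n), len(values))
    if n < 0:
        return values[-n:] + padding
    return padding + values[:-n]


# Hypothetical hourly temperatures for hours 0 to 4.
temperature = [10.0, 11.0, 12.0, 13.0, 14.0]

# Future covariates: the value known for 1 and 2 hours after each row.
temperature_future_1h = shift(temperature, -1)
temperature_future_2h = shift(temperature, -2)

for hour, (now, f1, f2) in enumerate(
    zip(temperature, temperature_future_1h, temperature_future_2h)
):
    print(hour, now, f1, f2)
```

The last `h` rows have no future value, hence the `None` entries at the end of each shifted column. A row shift only matches a time shift when the series is sorted by time and has no missing hours, which is one more reason to keep the dataset contiguous.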
@skrub.deferred
def build_features(
    prediction_time,
    electricity_lagged,
    all_city_weather,
    calendar,
    future_feature_horizons=[1, 24],
):
    return (
        prediction_time.join(
            electricity_lagged, left_on="prediction_time", right_on="time"
        )
        .join(
            all_city_weather.select(
                [pl.col("time")]
                + [
                    pl.col(c).shift(-h).alias(c + f"_future_{h}h")
                    for c in all_city_weather.columns
                    if c != "time"
                    for h in future_feature_horizons
                ]
            ),
            left_on="prediction_time",
            right_on="time",
        )
        .join(
            calendar.select(
                [pl.col("time")]
                + [
                    pl.col(c).shift(-h).alias(c + f"_future_{h}h")
                    for c in calendar.columns
                    if c != "time"
                    for h in future_feature_horizons
                ]
            ),
            left_on="prediction_time",
            right_on="time",
        )
    ).drop("prediction_time")

features = build_features(
    prediction_time=prediction_time,
    electricity_lagged=electricity_lagged,
    all_city_weather=all_city_weather,
    calendar=calendar,
).skb.mark_as_X()
features
| load_mw | load_mw_lag_1h | load_mw_lag_2h | load_mw_lag_3h | load_mw_lag_1d | load_mw_lag_1w | load_mw_rolling_median_24h | load_mw_rolling_median_7d | load_mw_iqr_24h | load_mw_iqr_7d | weather_temperature_2m_paris_future_1h | weather_temperature_2m_paris_future_24h | weather_precipitation_paris_future_1h | weather_precipitation_paris_future_24h | weather_wind_speed_10m_paris_future_1h | weather_wind_speed_10m_paris_future_24h | weather_cloud_cover_paris_future_1h | weather_cloud_cover_paris_future_24h | weather_soil_moisture_1_to_3cm_paris_future_1h | weather_soil_moisture_1_to_3cm_paris_future_24h | weather_relative_humidity_2m_paris_future_1h | weather_relative_humidity_2m_paris_future_24h | weather_temperature_2m_lyon_future_1h | weather_temperature_2m_lyon_future_24h | weather_precipitation_lyon_future_1h | weather_precipitation_lyon_future_24h | weather_wind_speed_10m_lyon_future_1h | weather_wind_speed_10m_lyon_future_24h | weather_cloud_cover_lyon_future_1h | weather_cloud_cover_lyon_future_24h | weather_soil_moisture_1_to_3cm_lyon_future_1h | weather_soil_moisture_1_to_3cm_lyon_future_24h | weather_relative_humidity_2m_lyon_future_1h | weather_relative_humidity_2m_lyon_future_24h | weather_temperature_2m_marseille_future_1h | weather_temperature_2m_marseille_future_24h | weather_precipitation_marseille_future_1h | weather_precipitation_marseille_future_24h | weather_wind_speed_10m_marseille_future_1h | weather_wind_speed_10m_marseille_future_24h | weather_cloud_cover_marseille_future_1h | weather_cloud_cover_marseille_future_24h | weather_soil_moisture_1_to_3cm_marseille_future_1h | weather_soil_moisture_1_to_3cm_marseille_future_24h | weather_relative_humidity_2m_marseille_future_1h | weather_relative_humidity_2m_marseille_future_24h | weather_temperature_2m_toulouse_future_1h | weather_temperature_2m_toulouse_future_24h | weather_precipitation_toulouse_future_1h | weather_precipitation_toulouse_future_24h | weather_wind_speed_10m_toulouse_future_1h | weather_wind_speed_10m_toulouse_future_24h | weather_cloud_cover_toulouse_future_1h | weather_cloud_cover_toulouse_future_24h | weather_soil_moisture_1_to_3cm_toulouse_future_1h | weather_soil_moisture_1_to_3cm_toulouse_future_24h | weather_relative_humidity_2m_toulouse_future_1h | weather_relative_humidity_2m_toulouse_future_24h | weather_temperature_2m_lille_future_1h | weather_temperature_2m_lille_future_24h | weather_precipitation_lille_future_1h | weather_precipitation_lille_future_24h | weather_wind_speed_10m_lille_future_1h | weather_wind_speed_10m_lille_future_24h | weather_cloud_cover_lille_future_1h | weather_cloud_cover_lille_future_24h | weather_soil_moisture_1_to_3cm_lille_future_1h | weather_soil_moisture_1_to_3cm_lille_future_24h | weather_relative_humidity_2m_lille_future_1h | weather_relative_humidity_2m_lille_future_24h | weather_temperature_2m_limoges_future_1h | weather_temperature_2m_limoges_future_24h | weather_precipitation_limoges_future_1h | weather_precipitation_limoges_future_24h | weather_wind_speed_10m_limoges_future_1h | weather_wind_speed_10m_limoges_future_24h | weather_cloud_cover_limoges_future_1h | weather_cloud_cover_limoges_future_24h | weather_soil_moisture_1_to_3cm_limoges_future_1h | weather_soil_moisture_1_to_3cm_limoges_future_24h | weather_relative_humidity_2m_limoges_future_1h | weather_relative_humidity_2m_limoges_future_24h | weather_temperature_2m_nantes_future_1h | weather_temperature_2m_nantes_future_24h | weather_precipitation_nantes_future_1h | weather_precipitation_nantes_future_24h | weather_wind_speed_10m_nantes_future_1h | weather_wind_speed_10m_nantes_future_24h | weather_cloud_cover_nantes_future_1h | weather_cloud_cover_nantes_future_24h | weather_soil_moisture_1_to_3cm_nantes_future_1h | weather_soil_moisture_1_to_3cm_nantes_future_24h | weather_relative_humidity_2m_nantes_future_1h | weather_relative_humidity_2m_nantes_future_24h | weather_temperature_2m_strasbourg_future_1h | weather_temperature_2m_strasbourg_future_24h | weather_precipitation_strasbourg_future_1h | weather_precipitation_strasbourg_future_24h | weather_wind_speed_10m_strasbourg_future_1h | weather_wind_speed_10m_strasbourg_future_24h | weather_cloud_cover_strasbourg_future_1h | weather_cloud_cover_strasbourg_future_24h | weather_soil_moisture_1_to_3cm_strasbourg_future_1h | weather_soil_moisture_1_to_3cm_strasbourg_future_24h | weather_relative_humidity_2m_strasbourg_future_1h | weather_relative_humidity_2m_strasbourg_future_24h | weather_temperature_2m_brest_future_1h | weather_temperature_2m_brest_future_24h | weather_precipitation_brest_future_1h | weather_precipitation_brest_future_24h | weather_wind_speed_10m_brest_future_1h | weather_wind_speed_10m_brest_future_24h | weather_cloud_cover_brest_future_1h | weather_cloud_cover_brest_future_24h | weather_soil_moisture_1_to_3cm_brest_future_1h | weather_soil_moisture_1_to_3cm_brest_future_24h | weather_relative_humidity_2m_brest_future_1h | weather_relative_humidity_2m_brest_future_24h | weather_temperature_2m_bayonne_future_1h | weather_temperature_2m_bayonne_future_24h | weather_precipitation_bayonne_future_1h | weather_precipitation_bayonne_future_24h | weather_wind_speed_10m_bayonne_future_1h | weather_wind_speed_10m_bayonne_future_24h | weather_cloud_cover_bayonne_future_1h | weather_cloud_cover_bayonne_future_24h | weather_soil_moisture_1_to_3cm_bayonne_future_1h | weather_soil_moisture_1_to_3cm_bayonne_future_24h | weather_relative_humidity_2m_bayonne_future_1h | weather_relative_humidity_2m_bayonne_future_24h | cal_hour_of_day_future_1h | cal_hour_of_day_future_24h | cal_day_of_week_future_1h | cal_day_of_week_future_24h | cal_day_of_year_future_1h | cal_day_of_year_future_24h | cal_year_future_1h | cal_year_future_24h | cal_is_holiday_future_1h | cal_is_holiday_future_24h |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 46395.0 | 47401.0 | 49217.0 | 51561.0 | 48600.0 | 59823.0 | 51122.5 | 54884.5 | 7834.0 | 8199.0 | 13.06450080871582 | 16.06450080871582 | 0.0 | 0.0 | 5.804825305938721 | 3.9600000381469727 | 100.0 | 0.0 |  |  | 64.0 | 66.0 | 10.585000038146973 | 13.135000228881836 | 0.0 | 0.0 | 4.553679466247559 | 5.759999752044678 | 63.0 | 0.0 |  |  | 65.0 | 55.0 | 14.328999519348145 | 15.078999519348145 | 0.0 | 0.0 | 1.527350664138794 | 5.191993713378906 | 0.0 | 0.0 |  |  | 72.0 | 59.0 | 11.582500457763672 | 12.482500076293945 | 0.0 | 0.0 | 19.40663719177246 | 16.831684112548828 | 0.0 | 6.0 |  |  | 67.0 | 54.0 | 10.399999618530273 | 13.149999618530273 | 0.0 | 0.0 | 7.56856632232666 | 8.20926284790039 | 48.0 | 0.0 |  |  | 67.0 | 77.0 | 7.401000022888184 | 8.901000022888184 | 0.0 | 0.0 | 4.320000171661377 | 4.452953815460205 | 0.0 | 0.0 |  |  | 85.0 | 85.0 | 9.527999877929688 | 11.628000259399414 | 0.0 | 0.0 | 10.972620010375977 | 9.44957160949707 | 0.0 | 0.0 |  |  | 86.0 | 81.0 | 10.337000846862793 | 12.53700065612793 | 0.0 | 0.0 | 1.1384198665618896 | 5.411986351013184 | 17.0 | 19.0 |  |  | 62.0 | 76.0 | 10.32800006866455 | 9.727999687194824 | 0.0 | 0.0 | 13.532360076904297 | 10.315114974975586 | 11.0 | 7.0 |  |  | 83.0 | 76.0 | 12.29800033569336 | 13.89799976348877 | 0.0 | 0.0 | 9.693296432495117 | 8.20926284790039 | 10.0 | 0.0 |  |  | 64.0 | 58.0 | 3 | 2 | 2 | 3 | 89 | 90 | 2021 | 2021 | False | False |
| 44269.0 | 46395.0 | 47401.0 | 49217.0 | 46722.0 | 59369.0 | 51122.5 | 54857.5 | 7834.0 | 8252.0 | 12.614500045776367 | 15.514500617980957 | 0.0 | 0.0 | 5.804825305938721 | 4.33497428894043 | 100.0 | 0.0 |  |  | 65.0 | 68.0 | 10.135000228881836 | 12.6850004196167 | 0.0 | 0.0 | 5.091168403625488 | 5.411986351013184 | 100.0 | 0.0 |  |  | 65.0 | 53.0 | 14.178999900817871 | 15.029000282287598 | 0.0 | 0.0 | 1.8356469869613647 | 5.399999618530273 | 0.0 | 0.0 |  |  | 72.0 | 59.0 | 11.432499885559082 | 12.232500076293945 | 0.0 | 0.0 | 19.914215087890625 | 16.831684112548828 | 0.0 | 11.0 |  |  | 68.0 | 55.0 | 10.050000190734863 | 12.75 | 0.0 | 0.0 | 7.072877883911133 | 7.695919990539551 | 58.0 | 0.0 |  |  | 67.0 | 79.0 | 7.151000022888184 | 8.401000022888184 | 0.0 | 0.0 | 4.802999019622803 | 4.802999019622803 | 0.0 | 0.0 |  |  | 85.0 | 85.0 | 9.378000259399414 | 11.378000259399414 | 0.0 | 0.0 | 10.703569412231445 | 9.885261535644531 | 6.0 | 5.0 |  |  | 84.0 | 80.0 | 9.88700008392334 | 12.03700065612793 | 0.0 | 0.0 | 1.7999999523162842 | 6.119999885559082 | 42.0 | 22.0 |  |  | 63.0 | 78.0 | 10.428000450134277 | 9.378000259399414 | 0.0 | 0.0 | 14.489720344543457 | 9.255571365356445 | 17.0 | 7.0 |  |  | 82.0 | 79.0 | 11.89799976348877 | 13.597999572753906 | 0.0 | 0.0 | 9.793059349060059 | 8.891343116760254 | 6.0 | 5.0 |  |  | 63.0 | 55.0 | 4 | 3 | 2 | 3 | 89 | 90 | 2021 | 2021 | False | False |
| 43874.0 | 44269.0 | 46395.0 | 47401.0 | 46329.0 | 57550.0 | 51122.5 | 54603.0 | 7834.0 | 8269.0 | 12.214500427246094 | 14.964500427246094 | 0.0 | 0.0 | 6.119999885559082 | 4.553679466247559 | 100.0 | 0.0 |  |  | 66.0 | 71.0 | 9.6850004196167 | 12.234999656677246 | 0.0 | 0.0 | 5.411986351013184 | 5.091168403625488 | 100.0 | 0.0 |  |  | 65.0 | 54.0 | 14.029000282287598 | 14.878999710083008 | 0.0 | 0.0 | 2.5199999809265137 | 5.623379707336426 | 0.0 | 0.0 |  |  | 73.0 | 59.0 | 11.232500076293945 | 11.982500076293945 | 0.0 | 0.0 | 20.124610900878906 | 17.33989715576172 | 0.0 | 7.0 |  |  | 70.0 | 57.0 | 9.800000190734863 | 12.449999809265137 | 0.0 | 0.0 | 7.072877883911133 | 7.594207286834717 | 98.0 | 0.0 |  |  | 68.0 | 80.0 | 6.60099983215332 | 7.901000022888184 | 0.0 | 0.0 | 4.802999019622803 | 4.213691711425781 | 0.0 | 0.0 |  |  | 85.0 | 85.0 | 9.128000259399414 | 10.928000450134277 | 0.0 | 0.0 | 10.972620010375977 | 10.883675575256348 | 0.0 | 0.0 |  |  | 84.0 | 82.0 | 9.487000465393066 | 11.53700065612793 | 0.0 | 0.0 | 3.5999999046325684 | 6.119999885559082 | 18.0 | 20.0 |  |  | 63.0 | 81.0 | 10.628000259399414 | 9.128000259399414 | 0.0 | 0.0 | 14.986553192138672 | 8.647496223449707 | 100.0 | 16.0 |  |  | 78.0 | 82.0 | 11.697999954223633 | 12.79800033569336 | 0.0 | 0.10000000149011612 | 9.511088371276855 | 9.199390411376953 | 10.0 | 0.0 |  |  | 61.0 | 66.0 | 5 | 4 | 2 | 3 | 89 | 90 | 2021 | 2021 | False | False |
| 46197.0 | 43874.0 | 44269.0 | 46395.0 | 49199.0 | 57188.0 | 51122.5 | 54325.0 | 8856.0 | 8278.0 | 11.764500617980957 | 14.56450080871582 | 0.0 | 0.0 | 5.315336227416992 | 4.452953815460205 | 57.0 | 0.0 |  |  | 68.0 | 72.0 | 9.135000228881836 | 11.734999656677246 | 0.0 | 0.0 | 5.399999618530273 | 5.091168403625488 | 61.0 | 0.0 |  |  | 66.0 | 56.0 | 13.979000091552734 | 14.779000282287598 | 0.0 | 0.0 | 3.617955207824707 | 6.130578994750977 | 5.0 | 0.0 |  |  | 72.0 | 60.0 | 10.982500076293945 | 11.882500648498535 | 0.0 | 0.0 | 20.150354385375977 | 17.61058807373047 | 0.0 | 5.0 |  |  | 71.0 | 57.0 | 9.550000190734863 | 12.149999618530273 | 0.0 | 0.0 | 6.989935398101807 | 7.2805495262146 | 100.0 | 0.0 |  |  | 71.0 | 81.0 | 6.200999736785889 | 7.60099983215332 | 0.0 | 0.0 | 4.452953815460205 | 4.213691711425781 | 0.0 | 0.0 |  |  | 83.0 | 83.0 | 8.928000450134277 | 10.628000259399414 | 0.0 | 0.0 | 11.18320083618164 | 11.304228782653809 | 6.0 | 5.0 |  |  | 85.0 | 83.0 | 8.937000274658203 | 11.13700008392334 | 0.0 | 0.0 | 4.33497428894043 | 6.839999675750732 | 19.0 | 20.0 |  |  | 65.0 | 81.0 | 10.428000450134277 | 8.928000450134277 | 0.0 | 0.0 | 15.46324634552002 | 8.640000343322754 | 14.0 | 67.0 |  |  | 77.0 | 85.0 | 11.39799976348877 | 12.89799976348877 | 0.0 | 0.0 | 8.714676856994629 | 9.65981388092041 | 13.0 | 100.0 |  |  | 61.0 | 66.0 | 6 | 5 | 2 | 3 | 89 | 90 | 2021 | 2021 | False | False |
| 51913.0 | 46197.0 | 43874.0 | 44269.0 | 54881.0 | 60367.0 | 51122.5 | 54140.0 | 8856.0 | 8278.0 | 11.264500617980957 | 14.06450080871582 | 0.0 | 0.0 | 5.483356475830078 | 4.379589080810547 | 8.0 | 6.0 |  |  | 71.0 | 74.0 | 8.635000228881836 | 11.335000038146973 | 0.0 | 0.0 | 5.447788238525391 | 5.052840709686279 | 12.0 | 0.0 |  |  | 68.0 | 57.0 | 13.878999710083008 | 14.678999900817871 | 0.0 | 0.0 | 5.351784706115723 | 4.33497428894043 | 0.0 | 0.0 |  |  | 71.0 | 61.0 | 10.882500648498535 | 11.682499885559082 | 0.0 | 0.0 | 19.862083435058594 | 17.581125259399414 | 0.0 | 10.0 |  |  | 72.0 | 56.0 | 9.149999618530273 | 11.75 | 0.0 | 0.0 | 7.56856632232666 | 7.594207286834717 | 72.0 | 0.0 |  |  | 76.0 | 81.0 | 6.10099983215332 | 7.401000022888184 | 0.0 | 0.0 | 4.213691711425781 | 4.553679466247559 | 0.0 | 0.0 |  |  | 80.0 | 78.0 | 8.628000259399414 | 10.227999687194824 | 0.0 | 0.0 | 10.99032211303711 | 10.188699722290039 | 0.0 | 6.0 |  |  | 87.0 | 86.0 | 8.53700065612793 | 10.737000465393066 | 0.0 | 0.0 | 3.976329803466797 | 7.235910415649414 | 73.0 | 18.0 |  |  | 66.0 | 81.0 | 10.527999877929688 | 8.628000259399414 | 0.0 | 0.0 | 16.74677276611328 | 9.422101020812988 | 19.0 | 59.0 |  |  | 76.0 | 89.0 | 11.597999572753906 | 12.697999954223633 | 0.0 | 0.0 | 8.20926284790039 | 8.891343116760254 | 76.0 | 51.0 |  |  | 60.0 | 64.0 | 7 | 6 | 2 | 3 | 89 | 90 | 2021 | 2021 | False | False |
| 41584.0 | 43226.0 | 44004.0 | 43246.0 | 40135.0 | 42773.0 | 41951.5 | 40323.5 | 6217.0 | 7202.0 | 27.915000915527344 | 24.26500129699707 | 0.0 | 0.10000000149011612 | 2.1600000858306885 | 9.007196426391602 | 66.0 | 100.0 | 0.2680000066757202 | 0.2680000066757202 | 41.0 | 60.0 | 27.56100082397461 | 21.861000061035156 | 0.0 | 0.0 | 0.8049845099449158 | 3.545588731765747 | 9.0 | 100.0 | 0.26899999380111694 | 0.27399998903274536 | 41.0 | 64.0 | 19.567001342773438 | 19.91699981689453 | 0.0 | 0.0 | 7.9932966232299805 | 10.464797019958496 | 0.0 | 0.0 | 0.14100000262260437 | 0.13899999856948853 | 86.0 | 87.0 | 27.770000457763672 | 29.32000160217285 | 0.0 | 0.0 | 5.86037540435791 | 6.989935874938965 | 48.0 | 7.0 | 0.21299999952316284 | 0.20800000429153442 | 37.0 | 45.0 | 21.08049964904785 | 24.030498504638672 | 0.0 | 0.0 | 6.479999542236328 | 14.399999618530273 | 99.0 | 100.0 | 0.2750000059604645 | 0.2669999897480011 | 72.0 | 59.0 | 26.923500061035156 | 25.62350082397461 | 0.0 | 0.0 | 3.096837043762207 | 10.594036102294922 | 61.0 | 19.0 | 0.16699999570846558 | 0.15700000524520874 | 48.0 | 53.0 | 25.786998748779297 | 21.73699951171875 | 0.0 | 0.0 | 6.8777899742126465 | 12.245292663574219 | 0.0 | 100.0 | 0.18799999356269836 | 0.17299999296665192 | 54.0 | 54.0 | 25.39349937438965 | 19.89349937438965 | 0.0 | 0.0 | 4.104631423950195 | 10.685391426086426 | 0.0 | 84.0 | 0.2639999985694885 | 0.3149999976158142 | 53.0 | 79.0 | 16.567501068115234 | 17.517499923706055 | 0.0 | 0.0 | 9.957108497619629 | 12.0693998336792 | 0.0 | 5.0 | 0.16200000047683716 | 0.1679999977350235 | 75.0 | 73.0 | 22.726499557495117 | 18.826499938964844 | 0.20000000298023224 | 0.0 | 9.422101020812988 | 12.096214294433594 | 100.0 | 100.0 | 0.19599999487400055 | 0.23800000548362732 | 67.0 | 83.0 | 22 | 21 | 5 | 6 | 150 | 151 | 2025 | 2025 | False | False |
| 42931.0 | 41584.0 | 43226.0 | 44004.0 | 41362.0 | 44204.0 | 42382.0 | 40323.5 | 6217.0 | 7196.0 | 26.165000915527344 | 23.364999771118164 | 0.0 | 0.0 | 3.2599384784698486 | 9.605998039245605 | 48.0 | 100.0 | 0.26899999380111694 | 0.26899999380111694 | 55.0 | 62.0 | 25.961000442504883 | 21.56100082397461 | 0.0 | 0.0 | 1.4843180179595947 | 2.545584201812744 | 38.0 | 80.0 | 0.27000001072883606 | 0.27300000190734863 | 51.0 | 72.0 | 19.767000198364258 | 19.66699981689453 | 0.0 | 0.0 | 6.287129878997803 | 6.618519306182861 | 0.0 | 100.0 | 0.14000000059604645 | 0.13899999856948853 | 82.0 | 87.0 | 25.82000160217285 | 27.3700008392334 | 0.0 | 0.0 | 4.072935104370117 | 5.399999618530273 | 89.0 | 100.0 | 0.2150000035762787 | 0.20800000429153442 | 42.0 | 54.0 | 20.58049964904785 | 21.430500030517578 | 0.0 | 0.0 | 6.839999675750732 | 12.959999084472656 | 100.0 | 100.0 | 0.2750000059604645 | 0.2669999897480011 | 74.0 | 61.0 | 24.37350082397461 | 23.323501586914062 | 0.0 | 0.0 | 2.9024126529693604 | 8.654986381530762 | 84.0 | 23.0 | 0.16599999368190765 | 0.15600000321865082 | 66.0 | 66.0 | 24.886999130249023 | 20.437000274658203 | 0.0 | 0.0 | 4.829906940460205 | 10.308830261230469 | 100.0 | 100.0 | 0.18799999356269836 | 0.17299999296665192 | 55.0 | 58.0 | 23.443500518798828 | 20.14349937438965 | 0.0 | 0.0 | 3.3190360069274902 | 6.287129878997803 | 0.0 | 94.0 | 0.2639999985694885 | 0.3059999942779541 | 66.0 | 75.0 | 15.717499732971191 | 16.267499923706055 | 0.0 | 0.0 | 8.55710220336914 | 9.114471435546875 | 100.0 | 99.0 | 0.16200000047683716 | 0.1679999977350235 | 81.0 | 77.0 | 21.026498794555664 | 18.476499557495117 | 0.0 | 0.0 | 4.0249223709106445 | 11.631956100463867 | 100.0 | 100.0 | 0.1979999989271164 | 0.23899999260902405 | 75.0 | 84.0 | 23 | 22 | 5 | 6 | 150 | 151 | 2025 | 2025 | False | False |
| 43812.0 | 42931.0 | 41584.0 | 43226.0 | 42722.0 | 45021.0 | 42382.0 | 40323.5 | 6288.0 | 7181.0 | 22.614999771118164 | 22.46500015258789 | 0.0 | 0.0 | 11.090103149414062 | 13.854154586791992 | 0.0 | 100.0 | 0.27000001072883606 | 0.2709999978542328 | 68.0 | 65.0 | 23.31100082397461 | 21.111000061035156 | 0.0 | 0.0 | 1.8356469869613647 | 2.545584201812744 | 0.0 | 71.0 | 0.2709999978542328 | 0.27300000190734863 | 66.0 | 76.0 | 19.91699981689453 | 17.91699981689453 | 0.0 | 0.0 | 5.483356475830078 | 7.386581897735596 | 14.0 | 100.0 | 0.14000000059604645 | 0.13899999856948853 | 85.0 | 96.0 | 24.470001220703125 | 26.020000457763672 | 0.0 | 0.0 | 4.213691711425781 | 4.6102495193481445 | 61.0 | 6.0 | 0.2160000056028366 | 0.20999999344348907 | 45.0 | 59.0 | 19.6304988861084 | 21.08049964904785 | 0.0 | 0.0 | 5.759999752044678 | 15.480000495910645 | 100.0 | 100.0 | 0.2750000059604645 | 0.2669999897480011 | 74.0 | 61.0 | 22.773500442504883 | 21.62350082397461 | 0.0 | 0.0 | 1.7999999523162842 | 7.636752605438232 | 100.0 | 100.0 | 0.16699999570846558 | 0.1550000011920929 | 71.0 | 73.0 | 24.187000274658203 | 19.336999893188477 | 0.0 | 0.0 | 8.404284477233887 | 6.489992141723633 | 17.0 | 100.0 | 0.18799999356269836 | 0.17399999499320984 | 56.0 | 63.0 | 22.193500518798828 | 19.443500518798828 | 0.0 | 0.0 | 2.545584201812744 | 4.349896430969238 | 0.0 | 94.0 | 0.2639999985694885 | 0.30300000309944153 | 73.0 | 78.0 | 15.217499732971191 | 15.517499923706055 | 0.0 | 0.0 | 8.89134407043457 | 7.559999465942383 | 3.0 | 93.0 | 0.16300000250339508 | 0.16899999976158142 | 87.0 | 84.0 | 22.12649917602539 | 18.226499557495117 | 0.0 | 0.0 | 11.384199142456055 | 10.144082069396973 | 100.0 | 100.0 | 0.19900000095367432 | 0.23899999260902405 | 67.0 | 84.0 | 0 | 23 | 6 | 6 | 151 | 151 | 2025 | 2025 | False | False |
| 41966.0 | 43812.0 | 42931.0 | 41584.0 | 41152.0 | 43402.0 | 42382.0 | 40323.5 | 6288.0 | 7181.0 | 21.065000534057617 | 20.96500015258789 | 0.0 | 0.0 | 7.771330833435059 | 9.6932954788208 | 0.0 | 95.0 | 0.27000001072883606 | 0.2709999978542328 | 73.0 | 63.0 | 22.161001205444336 | 20.661001205444336 | 0.0 | 0.0 | 1.2979984283447266 | 3.617955207824707 | 0.0 | 12.0 | 0.2720000147819519 | 0.27300000190734863 | 69.0 | 80.0 | 19.66699981689453 | 17.567001342773438 | 0.0 | 0.0 | 6.119999885559082 | 4.6938252449035645 | 39.0 | 100.0 | 0.14000000059604645 | 0.13899999856948853 | 85.0 | 96.0 | 23.470001220703125 | 24.57000160217285 | 0.0 | 0.0 | 3.7064266204833984 | 3.396233081817627 | 11.0 | 100.0 | 0.21799999475479126 | 0.210999995470047 | 47.0 | 71.0 | 19.030498504638672 | 18.58049964904785 | 0.0 | 0.0 | 6.839999675750732 | 10.440000534057617 | 100.0 | 38.0 | 0.2759999930858612 | 0.2669999897480011 | 80.0 | 68.0 | 21.023500442504883 | 20.273500442504883 | 0.0 | 0.0 | 2.9024126529693604 | 5.399999618530273 | 100.0 | 100.0 | 0.16699999570846558 | 0.1550000011920929 | 78.0 | 81.0 | 22.687000274658203 | 18.48699951171875 | 0.0 | 0.0 | 4.349896430969238 | 6.725354194641113 | 100.0 | 100.0 | 0.18799999356269836 | 0.17399999499320984 | 63.0 | 75.0 | 20.943500518798828 | 19.39349937438965 | 0.0 | 0.0 | 3.219938039779663 | 3.096837043762207 | 0.0 | 100.0 | 0.26499998569488525 | 0.30000001192092896 | 80.0 | 80.0 | 14.917499542236328 | 15.567500114440918 | 0.0 | 0.0 | 8.66994857788086 | 9.0 | 0.0 | 100.0 | 0.16300000250339508 | 0.17000000178813934 | 87.0 | 82.0 | 21.426498413085938 | 17.576499938964844 | 0.0 | 0.0 | 11.96695327758789 | 4.452953815460205 | 100.0 | 100.0 | 0.20100000500679016 | 0.23999999463558197 | 71.0 | 90.0 | 1 | 0 | 6 | 7 | 151 | 152 | 2025 | 2025 | False | False |
| 38248.0 | 41966.0 | 43812.0 | 42931.0 | 37524.0 | 39496.0 | 42382.0 | 40323.5 | 5564.0 | 7181.0 | 20.315000534057617 | 19.96500015258789 | 0.0 | 0.0 | 7.208993911743164 | 10.308831214904785 | 41.0 | 90.0 | 0.2709999978542328 | 0.2720000147819519 | 73.0 | 64.0 | 20.56100082397461 | 19.961000442504883 | 0.0 | 0.0 | 2.2768397331237793 | 3.2599384784698486 | 100.0 | 63.0 | 0.2720000147819519 | 0.27300000190734863 | 74.0 | 83.0 | 18.66699981689453 | 17.567001342773438 | 0.0 | 0.0 | 6.915374279022217 | 2.595996856689453 | 100.0 | 100.0 | 0.14000000059604645 | 0.13899999856948853 | 93.0 | 96.0 | 21.270000457763672 | 23.470001220703125 | 0.0 | 0.0 | 0.35999998450279236 | 4.0249223709106445 | 53.0 | 100.0 | 0.2199999988079071 | 0.21299999952316284 | 66.0 | 80.0 | 17.430500030517578 | 16.730499267578125 | 0.0 | 0.0 | 6.839999675750732 | 10.799999237060547 | 100.0 | 18.0 | 0.2759999930858612 | 0.2669999897480011 | 77.0 | 78.0 | 20.473501205444336 | 19.223501205444336 | 0.0 | 0.0 | 3.096837043762207 | 4.735060214996338 | 100.0 | 16.0 | 0.1679999977350235 | 0.1550000011920929 | 79.0 | 87.0 | 21.23699951171875 | 17.73699951171875 | 0.0 | 0.0 | 7.862518310546875 | 8.55710220336914 | 100.0 | 72.0 | 0.17800000309944153 | 0.17499999701976776 | 66.0 | 81.0 | 19.543498992919922 | 19.293498992919922 | 0.0 | 0.0 | 3.9600000381469727 | 3.617955207824707 | 0.0 | 100.0 | 0.26600000262260437 | 0.296999990940094 | 87.0 | 81.0 | 13.517499923706055 | 15.567500114440918 | 0.0 | 0.0 | 7.594207286834717 | 5.506940841674805 | 6.0 | 100.0 | 0.16699999570846558 | 0.17100000381469727 | 93.0 | 81.0 | 20.576499938964844 | 18.226499557495117 | 0.0 | 0.0 | 2.4149534702301025 | 8.089993476867676 | 100.0 | 100.0 | 0.2619999945163727 | 0.23999999463558197 | 75.0 | 83.0 | 2 | 1 | 6 | 7 | 151 | 152 | 2025 | 2025 | False | False |
load_mw
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,274 (63.7%)
- Mean ± Std
- 4.98e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_1h
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,274 (63.7%)
- Mean ± Std
- 4.98e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_2h
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,274 (63.7%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_3h
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,274 (63.7%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_1d
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,275 (63.7%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.81e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_lag_1w
Float64- Null values
- 0 (0.0%)
- Unique values
- 23,283 (63.7%)
- Mean ± Std
- 4.99e+04 ± 1.05e+04
- Median ± IQR
- 4.82e+04 ± 1.41e+04
- Min | Max
- 2.87e+04 | 8.66e+04
load_mw_rolling_median_24h
Float64- Null values
- 0 (0.0%)
- Unique values
- 9,599 (26.3%)
- Mean ± Std
- 5.05e+04 ± 9.29e+03
- Median ± IQR
- 4.74e+04 ± 1.29e+04
- Min | Max
- 3.37e+04 | 7.84e+04
load_mw_rolling_median_7d
Float64- Null values
- 0 (0.0%)
- Unique values
- 7,137 (19.5%)
- Mean ± Std
- 5.01e+04 ± 8.81e+03
- Median ± IQR
- 4.60e+04 ± 1.35e+04
- Min | Max
- 3.85e+04 | 7.39e+04
load_mw_iqr_24h
Float64- Null values
- 0 (0.0%)
- Unique values
- 5,908 (16.2%)
- Mean ± Std
- 6.52e+03 ± 1.56e+03
- Median ± IQR
- 6.43e+03 ± 2.05e+03
- Min | Max
- 2.32e+03 | 1.60e+04
load_mw_iqr_7d
Float64- Null values
- 0 (0.0%)
- Unique values
- 5,327 (14.6%)
- Mean ± Std
- 8.30e+03 ± 1.41e+03
- Median ± IQR
- 8.28e+03 ± 1.63e+03
- Min | Max
- 5.04e+03 | 1.86e+04
weather_temperature_2m_paris_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,439 (3.9%)
- Mean ± Std
- 13.6 ± 7.00
- Median ± IQR
- 13.2 ± 9.77
- Min | Max
- -5.13 | 40.6
weather_temperature_2m_paris_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,439 (3.9%)
- Mean ± Std
- 13.6 ± 7.00
- Median ± IQR
- 13.2 ± 9.77
- Min | Max
- -5.13 | 40.6
weather_precipitation_paris_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 135 (0.4%)
- Mean ± Std
- 0.0914 ± 0.543
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 29.5
weather_precipitation_paris_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 135 (0.4%)
- Mean ± Std
- 0.0914 ± 0.543
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 29.5
weather_wind_speed_10m_paris_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,774 (4.9%)
- Mean ± Std
- 10.0 ± 5.28
- Median ± IQR
- 9.29 ± 7.20
- Min | Max
- 0.00 | 50.1
weather_wind_speed_10m_paris_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,774 (4.9%)
- Mean ± Std
- 10.0 ± 5.28
- Median ± IQR
- 9.34 ± 7.20
- Min | Max
- 0.00 | 50.1
weather_cloud_cover_paris_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 69.2 ± 39.9
- Median ± IQR
- 97.0 ± 71.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_paris_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 69.3 ± 39.9
- Median ± IQR
- 97.0 ± 71.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_paris_future_1h
Float32- Null values
- 14,245 (39.0%)
- Unique values
- 277 (0.8%)
- Mean ± Std
- 0.298 ± 0.0397
- Median ± IQR
- 0.304 ± 0.0440
- Min | Max
- 0.139 | 0.436
weather_soil_moisture_1_to_3cm_paris_future_24h
Float32- Null values
- 14,222 (38.9%)
- Unique values
- 277 (0.8%)
- Mean ± Std
- 0.298 ± 0.0397
- Median ± IQR
- 0.304 ± 0.0440
- Min | Max
- 0.139 | 0.436
weather_relative_humidity_2m_paris_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 91 (0.2%)
- Mean ± Std
- 69.7 ± 18.1
- Median ± IQR
- 73.0 ± 27.0
- Min | Max
- 10.0 | 100.
weather_relative_humidity_2m_paris_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 91 (0.2%)
- Mean ± Std
- 69.7 ± 18.1
- Median ± IQR
- 73.0 ± 27.0
- Min | Max
- 10.0 | 100.
weather_temperature_2m_lyon_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,565 (4.3%)
- Mean ± Std
- 14.1 ± 7.97
- Median ± IQR
- 13.8 ± 11.4
- Min | Max
- -5.89 | 40.3
weather_temperature_2m_lyon_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,565 (4.3%)
- Mean ± Std
- 14.1 ± 7.97
- Median ± IQR
- 13.8 ± 11.4
- Min | Max
- -5.89 | 40.3
weather_precipitation_lyon_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 150 (0.4%)
- Mean ± Std
- 0.0993 ± 0.609
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 26.3
weather_precipitation_lyon_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 150 (0.4%)
- Mean ± Std
- 0.0993 ± 0.609
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 26.3
weather_wind_speed_10m_lyon_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,727 (4.7%)
- Mean ± Std
- 8.08 ± 6.05
- Median ± IQR
- 6.48 ± 7.67
- Min | Max
- 0.00 | 43.2
weather_wind_speed_10m_lyon_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,727 (4.7%)
- Mean ± Std
- 8.08 ± 6.05
- Median ± IQR
- 6.48 ± 7.67
- Min | Max
- 0.00 | 43.2
weather_cloud_cover_lyon_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 64.5 ± 41.8
- Median ± IQR
- 92.0 ± 88.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_lyon_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 64.6 ± 41.8
- Median ± IQR
- 92.0 ± 88.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_lyon_future_1h
Float32- Null values
- 14,245 (39.0%)
- Unique values
- 290 (0.8%)
- Mean ± Std
- 0.296 ± 0.0380
- Median ± IQR
- 0.304 ± 0.0320
- Min | Max
- 0.124 | 0.441
weather_soil_moisture_1_to_3cm_lyon_future_24h
Float32- Null values
- 14,222 (38.9%)
- Unique values
- 290 (0.8%)
- Mean ± Std
- 0.296 ± 0.0380
- Median ± IQR
- 0.304 ± 0.0330
- Min | Max
- 0.124 | 0.441
weather_relative_humidity_2m_lyon_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 89 (0.2%)
- Mean ± Std
- 68.6 ± 18.7
- Median ± IQR
- 71.0 ± 28.0
- Min | Max
- 12.0 | 100.
weather_relative_humidity_2m_lyon_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 89 (0.2%)
- Mean ± Std
- 68.6 ± 18.7
- Median ± IQR
- 71.0 ± 28.0
- Min | Max
- 12.0 | 100.
weather_temperature_2m_marseille_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,276 (3.5%)
- Mean ± Std
- 17.5 ± 6.15
- Median ± IQR
- 17.1 ± 9.75
- Min | Max
- 0.317 | 36.6
weather_temperature_2m_marseille_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 1,276 (3.5%)
- Mean ± Std
- 17.5 ± 6.15
- Median ± IQR
- 17.1 ± 9.75
- Min | Max
- 0.317 | 36.6
weather_precipitation_marseille_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 0.0514 ± 0.382
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 21.0
weather_precipitation_marseille_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 0.0514 ± 0.382
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 21.0
weather_wind_speed_10m_marseille_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 4,018 (11.0%)
- Mean ± Std
- 14.9 ± 10.8
- Median ± IQR
- 11.8 ± 12.3
- Min | Max
- 0.00 | 74.6
weather_wind_speed_10m_marseille_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 4,018 (11.0%)
- Mean ± Std
- 14.9 ± 10.8
- Median ± IQR
- 11.8 ± 12.3
- Min | Max
- 0.00 | 74.6
weather_cloud_cover_marseille_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 46.6 ± 44.5
- Median ± IQR
- 31.0 ± 100.
- Min | Max
- -1.00 | 101.
weather_cloud_cover_marseille_future_24h
Float32- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 46.7 ± 44.5
- Median ± IQR
- 32.0 ± 100.
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_marseille_future_1h
Float32- Null values
- 14,311 (39.2%)
- Unique values
- 354 (1.0%)
- Mean ± Std
- 0.227 ± 0.0748
- Median ± IQR
- 0.223 ± 0.119
- Min | Max
- 0.100 | 0.459
weather_soil_moisture_1_to_3cm_marseille_future_24h
Float32- Null values
- 14,288 (39.1%)
- Unique values
- 354 (1.0%)
- Mean ± Std
- 0.226 ± 0.0748
- Median ± IQR
- 0.223 ± 0.119
- Min | Max
- 0.100 | 0.459
weather_relative_humidity_2m_marseille_future_1h
Float32- Null values
- 0 (0.0%)
- Unique values
- 86 (0.2%)
- Mean ± Std
- 63.4 ± 13.2
- Median ± IQR
- 64.0 ± 19.0
- Min | Max
- 14.0 | 99.0
weather_relative_humidity_2m_marseille_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 86 (0.2%)
- Mean ± Std
- 63.4 ± 13.2
- Median ± IQR
- 64.0 ± 19.0
- Min | Max
- 14.0 | 99.0
weather_temperature_2m_toulouse_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,513 (4.1%)
- Mean ± Std
- 15.2 ± 7.48
- Median ± IQR
- 14.6 ± 10.5
- Min | Max
- -5.33 | 41.2
weather_temperature_2m_toulouse_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,513 (4.1%)
- Mean ± Std
- 15.2 ± 7.49
- Median ± IQR
- 14.6 ± 10.5
- Min | Max
- -5.33 | 41.2
weather_precipitation_toulouse_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 121 (0.3%)
- Mean ± Std
- 0.0740 ± 0.587
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 36.9
weather_precipitation_toulouse_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 121 (0.3%)
- Mean ± Std
- 0.0740 ± 0.587
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 36.9
weather_wind_speed_10m_toulouse_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,123 (5.8%)
- Mean ± Std
- 9.88 ± 6.48
- Median ± IQR
- 8.65 ± 8.61
- Min | Max
- 0.00 | 44.6
weather_wind_speed_10m_toulouse_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,123 (5.8%)
- Mean ± Std
- 9.87 ± 6.48
- Median ± IQR
- 8.65 ± 8.57
- Min | Max
- 0.00 | 44.6
weather_cloud_cover_toulouse_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 62.2 ± 42.1
- Median ± IQR
- 87.0 ± 90.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_toulouse_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 62.2 ± 42.1
- Median ± IQR
- 87.0 ± 90.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_toulouse_future_1h
Float32
- Null values
- 14,245 (39.0%)
- Unique values
- 310 (0.8%)
- Mean ± Std
- 0.271 ± 0.0505
- Median ± IQR
- 0.285 ± 0.0530
- Min | Max
- 0.104 | 0.454
weather_soil_moisture_1_to_3cm_toulouse_future_24h
Float32
- Null values
- 14,222 (38.9%)
- Unique values
- 310 (0.8%)
- Mean ± Std
- 0.271 ± 0.0505
- Median ± IQR
- 0.285 ± 0.0540
- Min | Max
- 0.104 | 0.454
weather_relative_humidity_2m_toulouse_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 93 (0.3%)
- Mean ± Std
- 69.7 ± 18.7
- Median ± IQR
- 73.0 ± 29.0
- Min | Max
- 8.00 | 100.
weather_relative_humidity_2m_toulouse_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 93 (0.3%)
- Mean ± Std
- 69.7 ± 18.7
- Median ± IQR
- 73.0 ± 29.0
- Min | Max
- 8.00 | 100.
weather_temperature_2m_lille_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,080 (5.7%)
- Mean ± Std
- 12.2 ± 6.58
- Median ± IQR
- 11.8 ± 9.05
- Min | Max
- -6.32 | 40.8
weather_temperature_2m_lille_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,081 (5.7%)
- Mean ± Std
- 12.2 ± 6.58
- Median ± IQR
- 11.8 ± 9.05
- Min | Max
- -6.32 | 40.8
weather_precipitation_lille_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 75 (0.2%)
- Mean ± Std
- 0.0977 ± 0.418
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 14.7
weather_precipitation_lille_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 75 (0.2%)
- Mean ± Std
- 0.0977 ± 0.418
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 14.7
weather_wind_speed_10m_lille_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,532 (6.9%)
- Mean ± Std
- 12.9 ± 6.60
- Median ± IQR
- 11.7 ± 8.47
- Min | Max
- 0.00 | 61.9
weather_wind_speed_10m_lille_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,532 (6.9%)
- Mean ± Std
- 12.9 ± 6.60
- Median ± IQR
- 11.7 ± 8.47
- Min | Max
- 0.00 | 61.9
weather_cloud_cover_lille_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 67.5 ± 40.4
- Median ± IQR
- 96.0 ± 78.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_lille_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 67.6 ± 40.4
- Median ± IQR
- 96.0 ± 78.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_lille_future_1h
Float32
- Null values
- 14,245 (39.0%)
- Unique values
- 209 (0.6%)
- Mean ± Std
- 0.306 ± 0.0315
- Median ± IQR
- 0.311 ± 0.0390
- Min | Max
- 0.203 | 0.422
weather_soil_moisture_1_to_3cm_lille_future_24h
Float32
- Null values
- 14,222 (38.9%)
- Unique values
- 209 (0.6%)
- Mean ± Std
- 0.306 ± 0.0315
- Median ± IQR
- 0.311 ± 0.0400
- Min | Max
- 0.203 | 0.422
weather_relative_humidity_2m_lille_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 96 (0.3%)
- Mean ± Std
- 74.7 ± 17.1
- Median ± IQR
- 79.0 ± 24.0
- Min | Max
- 0.00 | 100.
weather_relative_humidity_2m_lille_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 96 (0.3%)
- Mean ± Std
- 74.7 ± 17.1
- Median ± IQR
- 79.0 ± 24.0
- Min | Max
- 0.00 | 100.
weather_temperature_2m_limoges_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,572 (4.3%)
- Mean ± Std
- 12.7 ± 7.35
- Median ± IQR
- 12.1 ± 9.77
- Min | Max
- -7.70 | 39.7
weather_temperature_2m_limoges_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,572 (4.3%)
- Mean ± Std
- 12.7 ± 7.35
- Median ± IQR
- 12.1 ± 9.80
- Min | Max
- -7.70 | 39.7
weather_precipitation_limoges_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 153 (0.4%)
- Mean ± Std
- 0.123 ± 0.623
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 45.5
weather_precipitation_limoges_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 153 (0.4%)
- Mean ± Std
- 0.123 ± 0.623
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 45.5
weather_wind_speed_10m_limoges_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,359 (3.7%)
- Mean ± Std
- 7.58 ± 4.77
- Median ± IQR
- 6.52 ± 6.93
- Min | Max
- 0.00 | 33.9
weather_wind_speed_10m_limoges_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,359 (3.7%)
- Mean ± Std
- 7.58 ± 4.77
- Median ± IQR
- 6.52 ± 6.93
- Min | Max
- 0.00 | 33.9
weather_cloud_cover_limoges_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 66.6 ± 40.8
- Median ± IQR
- 93.0 ± 81.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_limoges_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 66.6 ± 40.8
- Median ± IQR
- 93.0 ± 81.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_limoges_future_1h
Float32
- Null values
- 14,245 (39.0%)
- Unique values
- 302 (0.8%)
- Mean ± Std
- 0.283 ± 0.0553
- Median ± IQR
- 0.298 ± 0.0650
- Min | Max
- 0.115 | 0.450
weather_soil_moisture_1_to_3cm_limoges_future_24h
Float32
- Null values
- 14,222 (38.9%)
- Unique values
- 302 (0.8%)
- Mean ± Std
- 0.282 ± 0.0554
- Median ± IQR
- 0.298 ± 0.0650
- Min | Max
- 0.115 | 0.450
weather_relative_humidity_2m_limoges_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 93 (0.3%)
- Mean ± Std
- 75.2 ± 19.9
- Median ± IQR
- 81.0 ± 29.0
- Min | Max
- 8.00 | 100.
weather_relative_humidity_2m_limoges_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 93 (0.3%)
- Mean ± Std
- 75.2 ± 19.8
- Median ± IQR
- 81.0 ± 29.0
- Min | Max
- 8.00 | 100.
weather_temperature_2m_nantes_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,539 (4.2%)
- Mean ± Std
- 13.8 ± 6.65
- Median ± IQR
- 13.4 ± 8.50
- Min | Max
- -3.86 | 43.4
weather_temperature_2m_nantes_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,539 (4.2%)
- Mean ± Std
- 13.8 ± 6.65
- Median ± IQR
- 13.4 ± 8.55
- Min | Max
- -3.86 | 43.4
weather_precipitation_nantes_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 112 (0.3%)
- Mean ± Std
- 0.0870 ± 0.437
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 14.1
weather_precipitation_nantes_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 112 (0.3%)
- Mean ± Std
- 0.0870 ± 0.437
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 14.1
weather_wind_speed_10m_nantes_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,833 (7.8%)
- Mean ± Std
- 13.4 ± 6.91
- Median ± IQR
- 12.0 ± 8.37
- Min | Max
- 0.00 | 58.6
weather_wind_speed_10m_nantes_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,833 (7.8%)
- Mean ± Std
- 13.4 ± 6.92
- Median ± IQR
- 12.0 ± 8.37
- Min | Max
- 0.00 | 58.6
weather_cloud_cover_nantes_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 65.1 ± 41.3
- Median ± IQR
- 94.0 ± 84.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_nantes_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 65.2 ± 41.3
- Median ± IQR
- 94.0 ± 84.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_nantes_future_1h
Float32
- Null values
- 14,311 (39.2%)
- Unique values
- 314 (0.9%)
- Mean ± Std
- 0.276 ± 0.0658
- Median ± IQR
- 0.295 ± 0.0840
- Min | Max
- 0.110 | 0.423
weather_soil_moisture_1_to_3cm_nantes_future_24h
Float32
- Null values
- 14,288 (39.1%)
- Unique values
- 314 (0.9%)
- Mean ± Std
- 0.276 ± 0.0658
- Median ± IQR
- 0.295 ± 0.0840
- Min | Max
- 0.110 | 0.423
weather_relative_humidity_2m_nantes_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 94 (0.3%)
- Mean ± Std
- 74.0 ± 17.3
- Median ± IQR
- 78.0 ± 25.0
- Min | Max
- 7.00 | 100.
weather_relative_humidity_2m_nantes_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 94 (0.3%)
- Mean ± Std
- 74.0 ± 17.3
- Median ± IQR
- 78.0 ± 25.0
- Min | Max
- 7.00 | 100.
weather_temperature_2m_strasbourg_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,525 (4.2%)
- Mean ± Std
- 12.7 ± 7.74
- Median ± IQR
- 12.3 ± 11.0
- Min | Max
- -9.31 | 38.8
weather_temperature_2m_strasbourg_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,525 (4.2%)
- Mean ± Std
- 12.7 ± 7.75
- Median ± IQR
- 12.3 ± 11.0
- Min | Max
- -9.31 | 38.8
weather_precipitation_strasbourg_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 127 (0.3%)
- Mean ± Std
- 0.102 ± 0.510
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 22.1
weather_precipitation_strasbourg_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 128 (0.4%)
- Mean ± Std
- 0.102 ± 0.517
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 22.1
weather_wind_speed_10m_strasbourg_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,520 (4.2%)
- Mean ± Std
- 8.45 ± 5.05
- Median ± IQR
- 7.52 ± 6.94
- Min | Max
- 0.00 | 38.1
weather_wind_speed_10m_strasbourg_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,520 (4.2%)
- Mean ± Std
- 8.46 ± 5.05
- Median ± IQR
- 7.52 ± 6.92
- Min | Max
- 0.00 | 38.1
weather_cloud_cover_strasbourg_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 69.7 ± 40.2
- Median ± IQR
- 98.0 ± 72.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_strasbourg_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 69.8 ± 40.2
- Median ± IQR
- 98.0 ± 72.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_strasbourg_future_1h
Float32
- Null values
- 14,245 (39.0%)
- Unique values
- 304 (0.8%)
- Mean ± Std
- 0.329 ± 0.0519
- Median ± IQR
- 0.343 ± 0.0530
- Min | Max
- 0.159 | 0.468
weather_soil_moisture_1_to_3cm_strasbourg_future_24h
Float32
- Null values
- 14,222 (38.9%)
- Unique values
- 304 (0.8%)
- Mean ± Std
- 0.329 ± 0.0519
- Median ± IQR
- 0.343 ± 0.0530
- Min | Max
- 0.159 | 0.468
weather_relative_humidity_2m_strasbourg_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 88 (0.2%)
- Mean ± Std
- 71.9 ± 18.5
- Median ± IQR
- 75.0 ± 28.0
- Min | Max
- 13.0 | 100.
weather_relative_humidity_2m_strasbourg_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 88 (0.2%)
- Mean ± Std
- 71.9 ± 18.5
- Median ± IQR
- 75.0 ± 28.0
- Min | Max
- 13.0 | 100.
weather_temperature_2m_brest_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,265 (3.5%)
- Mean ± Std
- 13.0 ± 4.89
- Median ± IQR
- 12.6 ± 6.20
- Min | Max
- -2.33 | 40.5
weather_temperature_2m_brest_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,265 (3.5%)
- Mean ± Std
- 13.0 ± 4.89
- Median ± IQR
- 12.6 ± 6.21
- Min | Max
- -2.33 | 40.5
weather_precipitation_brest_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 108 (0.3%)
- Mean ± Std
- 0.107 ± 0.432
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 12.7
weather_precipitation_brest_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 108 (0.3%)
- Mean ± Std
- 0.107 ± 0.432
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 12.7
weather_wind_speed_10m_brest_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 3,776 (10.3%)
- Mean ± Std
- 16.2 ± 8.89
- Median ± IQR
- 14.5 ± 11.8
- Min | Max
- 0.00 | 67.3
weather_wind_speed_10m_brest_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 3,776 (10.3%)
- Mean ± Std
- 16.2 ± 8.89
- Median ± IQR
- 14.5 ± 11.9
- Min | Max
- 0.00 | 67.3
weather_cloud_cover_brest_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 102 (0.3%)
- Mean ± Std
- 67.9 ± 39.8
- Median ± IQR
- 96.0 ± 75.0
- Min | Max
- 0.00 | 101.
weather_cloud_cover_brest_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 102 (0.3%)
- Mean ± Std
- 68.0 ± 39.8
- Median ± IQR
- 96.0 ± 74.0
- Min | Max
- 0.00 | 101.
weather_soil_moisture_1_to_3cm_brest_future_1h
Float32
- Null values
- 14,311 (39.2%)
- Unique values
- 279 (0.8%)
- Mean ± Std
- 0.267 ± 0.0571
- Median ± IQR
- 0.278 ± 0.0740
- Min | Max
- 0.116 | 0.409
weather_soil_moisture_1_to_3cm_brest_future_24h
Float32
- Null values
- 14,288 (39.1%)
- Unique values
- 279 (0.8%)
- Mean ± Std
- 0.266 ± 0.0572
- Median ± IQR
- 0.277 ± 0.0740
- Min | Max
- 0.116 | 0.409
weather_relative_humidity_2m_brest_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 90 (0.2%)
- Mean ± Std
- 78.2 ± 13.9
- Median ± IQR
- 81.0 ± 20.0
- Min | Max
- 10.0 | 100.
weather_relative_humidity_2m_brest_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 90 (0.2%)
- Mean ± Std
- 78.2 ± 13.9
- Median ± IQR
- 81.0 ± 20.0
- Min | Max
- 10.0 | 100.
weather_temperature_2m_bayonne_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,554 (4.3%)
- Mean ± Std
- 15.0 ± 6.40
- Median ± IQR
- 14.9 ± 8.47
- Min | Max
- -3.32 | 42.4
weather_temperature_2m_bayonne_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 1,554 (4.3%)
- Mean ± Std
- 15.0 ± 6.40
- Median ± IQR
- 14.9 ± 8.50
- Min | Max
- -3.32 | 42.4
weather_precipitation_bayonne_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 131 (0.4%)
- Mean ± Std
- 0.145 ± 0.553
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 18.5
weather_precipitation_bayonne_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 131 (0.4%)
- Mean ± Std
- 0.145 ± 0.553
- Median ± IQR
- 0.00 ± 0.00
- Min | Max
- 0.00 | 18.5
weather_wind_speed_10m_bayonne_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,488 (6.8%)
- Mean ± Std
- 10.9 ± 6.72
- Median ± IQR
- 9.36 ± 8.11
- Min | Max
- 0.00 | 51.5
weather_wind_speed_10m_bayonne_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 2,488 (6.8%)
- Mean ± Std
- 10.9 ± 6.72
- Median ± IQR
- 9.36 ± 8.11
- Min | Max
- 0.00 | 51.5
weather_cloud_cover_bayonne_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 66.4 ± 40.7
- Median ± IQR
- 95.0 ± 80.0
- Min | Max
- -1.00 | 101.
weather_cloud_cover_bayonne_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 103 (0.3%)
- Mean ± Std
- 66.5 ± 40.7
- Median ± IQR
- 95.0 ± 80.0
- Min | Max
- -1.00 | 101.
weather_soil_moisture_1_to_3cm_bayonne_future_1h
Float32
- Null values
- 14,311 (39.2%)
- Unique values
- 299 (0.8%)
- Mean ± Std
- 0.276 ± 0.0510
- Median ± IQR
- 0.284 ± 0.0470
- Min | Max
- 0.0970 | 0.414
weather_soil_moisture_1_to_3cm_bayonne_future_24h
Float32
- Null values
- 14,288 (39.1%)
- Unique values
- 299 (0.8%)
- Mean ± Std
- 0.276 ± 0.0509
- Median ± IQR
- 0.283 ± 0.0470
- Min | Max
- 0.0970 | 0.414
weather_relative_humidity_2m_bayonne_future_1h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 91 (0.2%)
- Mean ± Std
- 76.2 ± 16.1
- Median ± IQR
- 79.0 ± 25.0
- Min | Max
- 9.00 | 100.
weather_relative_humidity_2m_bayonne_future_24h
Float32
- Null values
- 0 (0.0%)
- Unique values
- 91 (0.2%)
- Mean ± Std
- 76.2 ± 16.0
- Median ± IQR
- 79.0 ± 25.0
- Min | Max
- 9.00 | 100.
cal_hour_of_day_future_1h
Int8
- Null values
- 0 (0.0%)
- Unique values
- 24 (< 0.1%)
- Mean ± Std
- 11.5 ± 6.92
- Median ± IQR
- 12.0 ± 11.0
- Min | Max
- 0.00 | 23.0
cal_hour_of_day_future_24h
Int8
- Null values
- 0 (0.0%)
- Unique values
- 24 (< 0.1%)
- Mean ± Std
- 11.5 ± 6.92
- Median ± IQR
- 12.0 ± 11.0
- Min | Max
- 0.00 | 23.0
cal_day_of_week_future_1h
Int8
- Null values
- 0 (0.0%)
- Unique values
- 7 (< 0.1%)
- Mean ± Std
- 4.00 ± 2.00
- Median ± IQR
- 4.00 ± 4.00
- Min | Max
- 1.00 | 7.00
cal_day_of_week_future_24h
Int8
- Null values
- 0 (0.0%)
- Unique values
- 7 (< 0.1%)
- Mean ± Std
- 4.00 ± 2.00
- Median ± IQR
- 4.00 ± 4.00
- Min | Max
- 1.00 | 7.00
cal_day_of_year_future_1h
Int16
- Null values
- 0 (0.0%)
- Unique values
- 366 (1.0%)
- Mean ± Std
- 181. ± 104.
- Median ± IQR
- 175. ± 177.
- Min | Max
- 1.00 | 366.
cal_day_of_year_future_24h
Int16
- Null values
- 0 (0.0%)
- Unique values
- 366 (1.0%)
- Mean ± Std
- 181. ± 104.
- Median ± IQR
- 175. ± 176.
- Min | Max
- 1.00 | 366.
cal_year_future_1h
Int32
- Null values
- 0 (0.0%)
- Unique values
- 5 (< 0.1%)
- Mean ± Std
- 2.02e+03 ± 1.25
- Median ± IQR
- 2.02e+03 ± 2.00
- Min | Max
- 2.02e+03 | 2.02e+03
cal_year_future_24h
Int32
- Null values
- 0 (0.0%)
- Unique values
- 5 (< 0.1%)
- Mean ± Std
- 2.02e+03 ± 1.25
- Median ± IQR
- 2.02e+03 ± 2.00
- Min | Max
- 2.02e+03 | 2.02e+03
cal_is_holiday_future_1h
Boolean
- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
cal_is_holiday_future_24h
Boolean
- Null values
- 0 (0.0%)
- Unique values
- 2 (< 0.1%)
| Column index | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | load_mw | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 1 | load_mw_lag_1h | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 2 | load_mw_lag_2h | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 3 | load_mw_lag_3h | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 4 | load_mw_lag_1d | Float64 | 0 (0.0%) | 23275 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 5 | load_mw_lag_1w | Float64 | 0 (0.0%) | 23283 (63.7%) | 4.99e+04 | 1.05e+04 | 2.87e+04 | 4.82e+04 | 8.66e+04 |
| 6 | load_mw_rolling_median_24h | Float64 | 0 (0.0%) | 9599 (26.3%) | 5.05e+04 | 9.29e+03 | 3.37e+04 | 4.74e+04 | 7.84e+04 |
| 7 | load_mw_rolling_median_7d | Float64 | 0 (0.0%) | 7137 (19.5%) | 5.01e+04 | 8.81e+03 | 3.85e+04 | 4.60e+04 | 7.39e+04 |
| 8 | load_mw_iqr_24h | Float64 | 0 (0.0%) | 5908 (16.2%) | 6.52e+03 | 1.56e+03 | 2.32e+03 | 6.43e+03 | 1.60e+04 |
| 9 | load_mw_iqr_7d | Float64 | 0 (0.0%) | 5327 (14.6%) | 8.30e+03 | 1.41e+03 | 5.04e+03 | 8.28e+03 | 1.86e+04 |
| 10 | weather_temperature_2m_paris_future_1h | Float32 | 0 (0.0%) | 1439 (3.9%) | 13.6 | 7.00 | -5.13 | 13.2 | 40.6 |
| 11 | weather_temperature_2m_paris_future_24h | Float32 | 0 (0.0%) | 1439 (3.9%) | 13.6 | 7.00 | -5.13 | 13.2 | 40.6 |
| 12 | weather_precipitation_paris_future_1h | Float32 | 0 (0.0%) | 135 (0.4%) | 0.0914 | 0.543 | 0.00 | 0.00 | 29.5 |
| 13 | weather_precipitation_paris_future_24h | Float32 | 0 (0.0%) | 135 (0.4%) | 0.0914 | 0.543 | 0.00 | 0.00 | 29.5 |
| 14 | weather_wind_speed_10m_paris_future_1h | Float32 | 0 (0.0%) | 1774 (4.9%) | 10.0 | 5.28 | 0.00 | 9.29 | 50.1 |
| 15 | weather_wind_speed_10m_paris_future_24h | Float32 | 0 (0.0%) | 1774 (4.9%) | 10.0 | 5.28 | 0.00 | 9.34 | 50.1 |
| 16 | weather_cloud_cover_paris_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 69.2 | 39.9 | -1.00 | 97.0 | 101. |
| 17 | weather_cloud_cover_paris_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 69.3 | 39.9 | -1.00 | 97.0 | 101. |
| 18 | weather_soil_moisture_1_to_3cm_paris_future_1h | Float32 | 14245 (39.0%) | 277 (0.8%) | 0.298 | 0.0397 | 0.139 | 0.304 | 0.436 |
| 19 | weather_soil_moisture_1_to_3cm_paris_future_24h | Float32 | 14222 (38.9%) | 277 (0.8%) | 0.298 | 0.0397 | 0.139 | 0.304 | 0.436 |
| 20 | weather_relative_humidity_2m_paris_future_1h | Float32 | 0 (0.0%) | 91 (0.2%) | 69.7 | 18.1 | 10.0 | 73.0 | 100. |
| 21 | weather_relative_humidity_2m_paris_future_24h | Float32 | 0 (0.0%) | 91 (0.2%) | 69.7 | 18.1 | 10.0 | 73.0 | 100. |
| 22 | weather_temperature_2m_lyon_future_1h | Float32 | 0 (0.0%) | 1565 (4.3%) | 14.1 | 7.97 | -5.89 | 13.8 | 40.3 |
| 23 | weather_temperature_2m_lyon_future_24h | Float32 | 0 (0.0%) | 1565 (4.3%) | 14.1 | 7.97 | -5.89 | 13.8 | 40.3 |
| 24 | weather_precipitation_lyon_future_1h | Float32 | 0 (0.0%) | 150 (0.4%) | 0.0993 | 0.609 | 0.00 | 0.00 | 26.3 |
| 25 | weather_precipitation_lyon_future_24h | Float32 | 0 (0.0%) | 150 (0.4%) | 0.0993 | 0.609 | 0.00 | 0.00 | 26.3 |
| 26 | weather_wind_speed_10m_lyon_future_1h | Float32 | 0 (0.0%) | 1727 (4.7%) | 8.08 | 6.05 | 0.00 | 6.48 | 43.2 |
| 27 | weather_wind_speed_10m_lyon_future_24h | Float32 | 0 (0.0%) | 1727 (4.7%) | 8.08 | 6.05 | 0.00 | 6.48 | 43.2 |
| 28 | weather_cloud_cover_lyon_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 64.5 | 41.8 | -1.00 | 92.0 | 101. |
| 29 | weather_cloud_cover_lyon_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 64.6 | 41.8 | -1.00 | 92.0 | 101. |
| 30 | weather_soil_moisture_1_to_3cm_lyon_future_1h | Float32 | 14245 (39.0%) | 290 (0.8%) | 0.296 | 0.0380 | 0.124 | 0.304 | 0.441 |
| 31 | weather_soil_moisture_1_to_3cm_lyon_future_24h | Float32 | 14222 (38.9%) | 290 (0.8%) | 0.296 | 0.0380 | 0.124 | 0.304 | 0.441 |
| 32 | weather_relative_humidity_2m_lyon_future_1h | Float32 | 0 (0.0%) | 89 (0.2%) | 68.6 | 18.7 | 12.0 | 71.0 | 100. |
| 33 | weather_relative_humidity_2m_lyon_future_24h | Float32 | 0 (0.0%) | 89 (0.2%) | 68.6 | 18.7 | 12.0 | 71.0 | 100. |
| 34 | weather_temperature_2m_marseille_future_1h | Float32 | 0 (0.0%) | 1276 (3.5%) | 17.5 | 6.15 | 0.317 | 17.1 | 36.6 |
| 35 | weather_temperature_2m_marseille_future_24h | Float32 | 0 (0.0%) | 1276 (3.5%) | 17.5 | 6.15 | 0.317 | 17.1 | 36.6 |
| 36 | weather_precipitation_marseille_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 0.0514 | 0.382 | 0.00 | 0.00 | 21.0 |
| 37 | weather_precipitation_marseille_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 0.0514 | 0.382 | 0.00 | 0.00 | 21.0 |
| 38 | weather_wind_speed_10m_marseille_future_1h | Float32 | 0 (0.0%) | 4018 (11.0%) | 14.9 | 10.8 | 0.00 | 11.8 | 74.6 |
| 39 | weather_wind_speed_10m_marseille_future_24h | Float32 | 0 (0.0%) | 4018 (11.0%) | 14.9 | 10.8 | 0.00 | 11.8 | 74.6 |
| 40 | weather_cloud_cover_marseille_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 46.6 | 44.5 | -1.00 | 31.0 | 101. |
| 41 | weather_cloud_cover_marseille_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 46.7 | 44.5 | -1.00 | 32.0 | 101. |
| 42 | weather_soil_moisture_1_to_3cm_marseille_future_1h | Float32 | 14311 (39.2%) | 354 (1.0%) | 0.227 | 0.0748 | 0.100 | 0.223 | 0.459 |
| 43 | weather_soil_moisture_1_to_3cm_marseille_future_24h | Float32 | 14288 (39.1%) | 354 (1.0%) | 0.226 | 0.0748 | 0.100 | 0.223 | 0.459 |
| 44 | weather_relative_humidity_2m_marseille_future_1h | Float32 | 0 (0.0%) | 86 (0.2%) | 63.4 | 13.2 | 14.0 | 64.0 | 99.0 |
| 45 | weather_relative_humidity_2m_marseille_future_24h | Float32 | 0 (0.0%) | 86 (0.2%) | 63.4 | 13.2 | 14.0 | 64.0 | 99.0 |
| 46 | weather_temperature_2m_toulouse_future_1h | Float32 | 0 (0.0%) | 1513 (4.1%) | 15.2 | 7.48 | -5.33 | 14.6 | 41.2 |
| 47 | weather_temperature_2m_toulouse_future_24h | Float32 | 0 (0.0%) | 1513 (4.1%) | 15.2 | 7.49 | -5.33 | 14.6 | 41.2 |
| 48 | weather_precipitation_toulouse_future_1h | Float32 | 0 (0.0%) | 121 (0.3%) | 0.0740 | 0.587 | 0.00 | 0.00 | 36.9 |
| 49 | weather_precipitation_toulouse_future_24h | Float32 | 0 (0.0%) | 121 (0.3%) | 0.0740 | 0.587 | 0.00 | 0.00 | 36.9 |
| 50 | weather_wind_speed_10m_toulouse_future_1h | Float32 | 0 (0.0%) | 2123 (5.8%) | 9.88 | 6.48 | 0.00 | 8.65 | 44.6 |
| 51 | weather_wind_speed_10m_toulouse_future_24h | Float32 | 0 (0.0%) | 2123 (5.8%) | 9.87 | 6.48 | 0.00 | 8.65 | 44.6 |
| 52 | weather_cloud_cover_toulouse_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 62.2 | 42.1 | -1.00 | 87.0 | 101. |
| 53 | weather_cloud_cover_toulouse_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 62.2 | 42.1 | -1.00 | 87.0 | 101. |
| 54 | weather_soil_moisture_1_to_3cm_toulouse_future_1h | Float32 | 14245 (39.0%) | 310 (0.8%) | 0.271 | 0.0505 | 0.104 | 0.285 | 0.454 |
| 55 | weather_soil_moisture_1_to_3cm_toulouse_future_24h | Float32 | 14222 (38.9%) | 310 (0.8%) | 0.271 | 0.0505 | 0.104 | 0.285 | 0.454 |
| 56 | weather_relative_humidity_2m_toulouse_future_1h | Float32 | 0 (0.0%) | 93 (0.3%) | 69.7 | 18.7 | 8.00 | 73.0 | 100. |
| 57 | weather_relative_humidity_2m_toulouse_future_24h | Float32 | 0 (0.0%) | 93 (0.3%) | 69.7 | 18.7 | 8.00 | 73.0 | 100. |
| 58 | weather_temperature_2m_lille_future_1h | Float32 | 0 (0.0%) | 2080 (5.7%) | 12.2 | 6.58 | -6.32 | 11.8 | 40.8 |
| 59 | weather_temperature_2m_lille_future_24h | Float32 | 0 (0.0%) | 2081 (5.7%) | 12.2 | 6.58 | -6.32 | 11.8 | 40.8 |
| 60 | weather_precipitation_lille_future_1h | Float32 | 0 (0.0%) | 75 (0.2%) | 0.0977 | 0.418 | 0.00 | 0.00 | 14.7 |
| 61 | weather_precipitation_lille_future_24h | Float32 | 0 (0.0%) | 75 (0.2%) | 0.0977 | 0.418 | 0.00 | 0.00 | 14.7 |
| 62 | weather_wind_speed_10m_lille_future_1h | Float32 | 0 (0.0%) | 2532 (6.9%) | 12.9 | 6.60 | 0.00 | 11.7 | 61.9 |
| 63 | weather_wind_speed_10m_lille_future_24h | Float32 | 0 (0.0%) | 2532 (6.9%) | 12.9 | 6.60 | 0.00 | 11.7 | 61.9 |
| 64 | weather_cloud_cover_lille_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 67.5 | 40.4 | -1.00 | 96.0 | 101. |
| 65 | weather_cloud_cover_lille_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 67.6 | 40.4 | -1.00 | 96.0 | 101. |
| 66 | weather_soil_moisture_1_to_3cm_lille_future_1h | Float32 | 14245 (39.0%) | 209 (0.6%) | 0.306 | 0.0315 | 0.203 | 0.311 | 0.422 |
| 67 | weather_soil_moisture_1_to_3cm_lille_future_24h | Float32 | 14222 (38.9%) | 209 (0.6%) | 0.306 | 0.0315 | 0.203 | 0.311 | 0.422 |
| 68 | weather_relative_humidity_2m_lille_future_1h | Float32 | 0 (0.0%) | 96 (0.3%) | 74.7 | 17.1 | 0.00 | 79.0 | 100. |
| 69 | weather_relative_humidity_2m_lille_future_24h | Float32 | 0 (0.0%) | 96 (0.3%) | 74.7 | 17.1 | 0.00 | 79.0 | 100. |
| 70 | weather_temperature_2m_limoges_future_1h | Float32 | 0 (0.0%) | 1572 (4.3%) | 12.7 | 7.35 | -7.70 | 12.1 | 39.7 |
| 71 | weather_temperature_2m_limoges_future_24h | Float32 | 0 (0.0%) | 1572 (4.3%) | 12.7 | 7.35 | -7.70 | 12.1 | 39.7 |
| 72 | weather_precipitation_limoges_future_1h | Float32 | 0 (0.0%) | 153 (0.4%) | 0.123 | 0.623 | 0.00 | 0.00 | 45.5 |
| 73 | weather_precipitation_limoges_future_24h | Float32 | 0 (0.0%) | 153 (0.4%) | 0.123 | 0.623 | 0.00 | 0.00 | 45.5 |
| 74 | weather_wind_speed_10m_limoges_future_1h | Float32 | 0 (0.0%) | 1359 (3.7%) | 7.58 | 4.77 | 0.00 | 6.52 | 33.9 |
| 75 | weather_wind_speed_10m_limoges_future_24h | Float32 | 0 (0.0%) | 1359 (3.7%) | 7.58 | 4.77 | 0.00 | 6.52 | 33.9 |
| 76 | weather_cloud_cover_limoges_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 66.6 | 40.8 | -1.00 | 93.0 | 101. |
| 77 | weather_cloud_cover_limoges_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 66.6 | 40.8 | -1.00 | 93.0 | 101. |
| 78 | weather_soil_moisture_1_to_3cm_limoges_future_1h | Float32 | 14245 (39.0%) | 302 (0.8%) | 0.283 | 0.0553 | 0.115 | 0.298 | 0.450 |
| 79 | weather_soil_moisture_1_to_3cm_limoges_future_24h | Float32 | 14222 (38.9%) | 302 (0.8%) | 0.282 | 0.0554 | 0.115 | 0.298 | 0.450 |
| 80 | weather_relative_humidity_2m_limoges_future_1h | Float32 | 0 (0.0%) | 93 (0.3%) | 75.2 | 19.9 | 8.00 | 81.0 | 100. |
| 81 | weather_relative_humidity_2m_limoges_future_24h | Float32 | 0 (0.0%) | 93 (0.3%) | 75.2 | 19.8 | 8.00 | 81.0 | 100. |
| 82 | weather_temperature_2m_nantes_future_1h | Float32 | 0 (0.0%) | 1539 (4.2%) | 13.8 | 6.65 | -3.86 | 13.4 | 43.4 |
| 83 | weather_temperature_2m_nantes_future_24h | Float32 | 0 (0.0%) | 1539 (4.2%) | 13.8 | 6.65 | -3.86 | 13.4 | 43.4 |
| 84 | weather_precipitation_nantes_future_1h | Float32 | 0 (0.0%) | 112 (0.3%) | 0.0870 | 0.437 | 0.00 | 0.00 | 14.1 |
| 85 | weather_precipitation_nantes_future_24h | Float32 | 0 (0.0%) | 112 (0.3%) | 0.0870 | 0.437 | 0.00 | 0.00 | 14.1 |
| 86 | weather_wind_speed_10m_nantes_future_1h | Float32 | 0 (0.0%) | 2833 (7.8%) | 13.4 | 6.91 | 0.00 | 12.0 | 58.6 |
| 87 | weather_wind_speed_10m_nantes_future_24h | Float32 | 0 (0.0%) | 2833 (7.8%) | 13.4 | 6.92 | 0.00 | 12.0 | 58.6 |
| 88 | weather_cloud_cover_nantes_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 65.1 | 41.3 | -1.00 | 94.0 | 101. |
| 89 | weather_cloud_cover_nantes_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 65.2 | 41.3 | -1.00 | 94.0 | 101. |
| 90 | weather_soil_moisture_1_to_3cm_nantes_future_1h | Float32 | 14311 (39.2%) | 314 (0.9%) | 0.276 | 0.0658 | 0.110 | 0.295 | 0.423 |
| 91 | weather_soil_moisture_1_to_3cm_nantes_future_24h | Float32 | 14288 (39.1%) | 314 (0.9%) | 0.276 | 0.0658 | 0.110 | 0.295 | 0.423 |
| 92 | weather_relative_humidity_2m_nantes_future_1h | Float32 | 0 (0.0%) | 94 (0.3%) | 74.0 | 17.3 | 7.00 | 78.0 | 100. |
| 93 | weather_relative_humidity_2m_nantes_future_24h | Float32 | 0 (0.0%) | 94 (0.3%) | 74.0 | 17.3 | 7.00 | 78.0 | 100. |
| 94 | weather_temperature_2m_strasbourg_future_1h | Float32 | 0 (0.0%) | 1525 (4.2%) | 12.7 | 7.74 | -9.31 | 12.3 | 38.8 |
| 95 | weather_temperature_2m_strasbourg_future_24h | Float32 | 0 (0.0%) | 1525 (4.2%) | 12.7 | 7.75 | -9.31 | 12.3 | 38.8 |
| 96 | weather_precipitation_strasbourg_future_1h | Float32 | 0 (0.0%) | 127 (0.3%) | 0.102 | 0.510 | 0.00 | 0.00 | 22.1 |
| 97 | weather_precipitation_strasbourg_future_24h | Float32 | 0 (0.0%) | 128 (0.4%) | 0.102 | 0.517 | 0.00 | 0.00 | 22.1 |
| 98 | weather_wind_speed_10m_strasbourg_future_1h | Float32 | 0 (0.0%) | 1520 (4.2%) | 8.45 | 5.05 | 0.00 | 7.52 | 38.1 |
| 99 | weather_wind_speed_10m_strasbourg_future_24h | Float32 | 0 (0.0%) | 1520 (4.2%) | 8.46 | 5.05 | 0.00 | 7.52 | 38.1 |
| 100 | weather_cloud_cover_strasbourg_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 69.7 | 40.2 | -1.00 | 98.0 | 101. |
| 101 | weather_cloud_cover_strasbourg_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 69.8 | 40.2 | -1.00 | 98.0 | 101. |
| 102 | weather_soil_moisture_1_to_3cm_strasbourg_future_1h | Float32 | 14245 (39.0%) | 304 (0.8%) | 0.329 | 0.0519 | 0.159 | 0.343 | 0.468 |
| 103 | weather_soil_moisture_1_to_3cm_strasbourg_future_24h | Float32 | 14222 (38.9%) | 304 (0.8%) | 0.329 | 0.0519 | 0.159 | 0.343 | 0.468 |
| 104 | weather_relative_humidity_2m_strasbourg_future_1h | Float32 | 0 (0.0%) | 88 (0.2%) | 71.9 | 18.5 | 13.0 | 75.0 | 100. |
| 105 | weather_relative_humidity_2m_strasbourg_future_24h | Float32 | 0 (0.0%) | 88 (0.2%) | 71.9 | 18.5 | 13.0 | 75.0 | 100. |
| 106 | weather_temperature_2m_brest_future_1h | Float32 | 0 (0.0%) | 1265 (3.5%) | 13.0 | 4.89 | -2.33 | 12.6 | 40.5 |
| 107 | weather_temperature_2m_brest_future_24h | Float32 | 0 (0.0%) | 1265 (3.5%) | 13.0 | 4.89 | -2.33 | 12.6 | 40.5 |
| 108 | weather_precipitation_brest_future_1h | Float32 | 0 (0.0%) | 108 (0.3%) | 0.107 | 0.432 | 0.00 | 0.00 | 12.7 |
| 109 | weather_precipitation_brest_future_24h | Float32 | 0 (0.0%) | 108 (0.3%) | 0.107 | 0.432 | 0.00 | 0.00 | 12.7 |
| 110 | weather_wind_speed_10m_brest_future_1h | Float32 | 0 (0.0%) | 3776 (10.3%) | 16.2 | 8.89 | 0.00 | 14.5 | 67.3 |
| 111 | weather_wind_speed_10m_brest_future_24h | Float32 | 0 (0.0%) | 3776 (10.3%) | 16.2 | 8.89 | 0.00 | 14.5 | 67.3 |
| 112 | weather_cloud_cover_brest_future_1h | Float32 | 0 (0.0%) | 102 (0.3%) | 67.9 | 39.8 | 0.00 | 96.0 | 101. |
| 113 | weather_cloud_cover_brest_future_24h | Float32 | 0 (0.0%) | 102 (0.3%) | 68.0 | 39.8 | 0.00 | 96.0 | 101. |
| 114 | weather_soil_moisture_1_to_3cm_brest_future_1h | Float32 | 14311 (39.2%) | 279 (0.8%) | 0.267 | 0.0571 | 0.116 | 0.278 | 0.409 |
| 115 | weather_soil_moisture_1_to_3cm_brest_future_24h | Float32 | 14288 (39.1%) | 279 (0.8%) | 0.266 | 0.0572 | 0.116 | 0.277 | 0.409 |
| 116 | weather_relative_humidity_2m_brest_future_1h | Float32 | 0 (0.0%) | 90 (0.2%) | 78.2 | 13.9 | 10.0 | 81.0 | 100. |
| 117 | weather_relative_humidity_2m_brest_future_24h | Float32 | 0 (0.0%) | 90 (0.2%) | 78.2 | 13.9 | 10.0 | 81.0 | 100. |
| 118 | weather_temperature_2m_bayonne_future_1h | Float32 | 0 (0.0%) | 1554 (4.3%) | 15.0 | 6.40 | -3.32 | 14.9 | 42.4 |
| 119 | weather_temperature_2m_bayonne_future_24h | Float32 | 0 (0.0%) | 1554 (4.3%) | 15.0 | 6.40 | -3.32 | 14.9 | 42.4 |
| 120 | weather_precipitation_bayonne_future_1h | Float32 | 0 (0.0%) | 131 (0.4%) | 0.145 | 0.553 | 0.00 | 0.00 | 18.5 |
| 121 | weather_precipitation_bayonne_future_24h | Float32 | 0 (0.0%) | 131 (0.4%) | 0.145 | 0.553 | 0.00 | 0.00 | 18.5 |
| 122 | weather_wind_speed_10m_bayonne_future_1h | Float32 | 0 (0.0%) | 2488 (6.8%) | 10.9 | 6.72 | 0.00 | 9.36 | 51.5 |
| 123 | weather_wind_speed_10m_bayonne_future_24h | Float32 | 0 (0.0%) | 2488 (6.8%) | 10.9 | 6.72 | 0.00 | 9.36 | 51.5 |
| 124 | weather_cloud_cover_bayonne_future_1h | Float32 | 0 (0.0%) | 103 (0.3%) | 66.4 | 40.7 | -1.00 | 95.0 | 101. |
| 125 | weather_cloud_cover_bayonne_future_24h | Float32 | 0 (0.0%) | 103 (0.3%) | 66.5 | 40.7 | -1.00 | 95.0 | 101. |
| 126 | weather_soil_moisture_1_to_3cm_bayonne_future_1h | Float32 | 14311 (39.2%) | 299 (0.8%) | 0.276 | 0.0510 | 0.0970 | 0.284 | 0.414 |
| 127 | weather_soil_moisture_1_to_3cm_bayonne_future_24h | Float32 | 14288 (39.1%) | 299 (0.8%) | 0.276 | 0.0509 | 0.0970 | 0.283 | 0.414 |
| 128 | weather_relative_humidity_2m_bayonne_future_1h | Float32 | 0 (0.0%) | 91 (0.2%) | 76.2 | 16.1 | 9.00 | 79.0 | 100. |
| 129 | weather_relative_humidity_2m_bayonne_future_24h | Float32 | 0 (0.0%) | 91 (0.2%) | 76.2 | 16.0 | 9.00 | 79.0 | 100. |
| 130 | cal_hour_of_day_future_1h | Int8 | 0 (0.0%) | 24 (< 0.1%) | 11.5 | 6.92 | 0.00 | 12.0 | 23.0 |
| 131 | cal_hour_of_day_future_24h | Int8 | 0 (0.0%) | 24 (< 0.1%) | 11.5 | 6.92 | 0.00 | 12.0 | 23.0 |
| 132 | cal_day_of_week_future_1h | Int8 | 0 (0.0%) | 7 (< 0.1%) | 4.00 | 2.00 | 1.00 | 4.00 | 7.00 |
| 133 | cal_day_of_week_future_24h | Int8 | 0 (0.0%) | 7 (< 0.1%) | 4.00 | 2.00 | 1.00 | 4.00 | 7.00 |
| 134 | cal_day_of_year_future_1h | Int16 | 0 (0.0%) | 366 (1.0%) | 181. | 104. | 1.00 | 175. | 366. |
| 135 | cal_day_of_year_future_24h | Int16 | 0 (0.0%) | 366 (1.0%) | 181. | 104. | 1.00 | 175. | 366. |
| 136 | cal_year_future_1h | Int32 | 0 (0.0%) | 5 (< 0.1%) | 2.02e+03 | 1.25 | 2.02e+03 | 2.02e+03 | 2.02e+03 |
| 137 | cal_year_future_24h | Int32 | 0 (0.0%) | 5 (< 0.1%) | 2.02e+03 | 1.25 | 2.02e+03 | 2.02e+03 | 2.02e+03 |
| 138 | cal_is_holiday_future_1h | Boolean | 0 (0.0%) | 2 (< 0.1%) | | | | | |
| 139 | cal_is_holiday_future_24h | Boolean | 0 (0.0%) | 2 (< 0.1%) | | | | | |
Let’s build training and evaluation targets for all possible horizons from 1 to 24 hours.
horizons = range(1, 25)
target_column_name_pattern = "load_mw_horizon_{horizon}h"
@skrub.deferred
def build_targets(prediction_time, electricity, horizons):
    return prediction_time.join(
        electricity.with_columns(
            [
                pl.col("load_mw")
                .shift(-h)
                .alias(target_column_name_pattern.format(horizon=h))
                for h in horizons
            ]
        ),
        left_on="prediction_time",
        right_on="time",
    )
targets = build_targets(prediction_time, electricity, horizons)
targets
| prediction_time | load_mw | load_mw_horizon_1h | load_mw_horizon_2h | load_mw_horizon_3h | load_mw_horizon_4h | load_mw_horizon_5h | load_mw_horizon_6h | load_mw_horizon_7h | load_mw_horizon_8h | load_mw_horizon_9h | load_mw_horizon_10h | load_mw_horizon_11h | load_mw_horizon_12h | load_mw_horizon_13h | load_mw_horizon_14h | load_mw_horizon_15h | load_mw_horizon_16h | load_mw_horizon_17h | load_mw_horizon_18h | load_mw_horizon_19h | load_mw_horizon_20h | load_mw_horizon_21h | load_mw_horizon_22h | load_mw_horizon_23h | load_mw_horizon_24h |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-03-30 00:00:00+00:00 | 46395.0 | 44269.0 | 43874.0 | 46197.0 | 51913.0 | 56939.0 | 58329.0 | 57671.0 | 55421.0 | 54578.0 | 54866.0 | 53069.0 | 50920.0 | 49051.0 | 47607.0 | 46991.0 | 48358.0 | 50709.0 | 51211.0 | 49234.0 | 49122.0 | 49962.0 | 47394.0 | 45452.0 | 44510.0 |
| 2021-03-30 01:00:00+00:00 | 44269.0 | 43874.0 | 46197.0 | 51913.0 | 56939.0 | 58329.0 | 57671.0 | 55421.0 | 54578.0 | 54866.0 | 53069.0 | 50920.0 | 49051.0 | 47607.0 | 46991.0 | 48358.0 | 50709.0 | 51211.0 | 49234.0 | 49122.0 | 49962.0 | 47394.0 | 45452.0 | 44510.0 | 42417.0 |
| 2021-03-30 02:00:00+00:00 | 43874.0 | 46197.0 | 51913.0 | 56939.0 | 58329.0 | 57671.0 | 55421.0 | 54578.0 | 54866.0 | 53069.0 | 50920.0 | 49051.0 | 47607.0 | 46991.0 | 48358.0 | 50709.0 | 51211.0 | 49234.0 | 49122.0 | 49962.0 | 47394.0 | 45452.0 | 44510.0 | 42417.0 | 41633.0 |
| 2021-03-30 03:00:00+00:00 | 46197.0 | 51913.0 | 56939.0 | 58329.0 | 57671.0 | 55421.0 | 54578.0 | 54866.0 | 53069.0 | 50920.0 | 49051.0 | 47607.0 | 46991.0 | 48358.0 | 50709.0 | 51211.0 | 49234.0 | 49122.0 | 49962.0 | 47394.0 | 45452.0 | 44510.0 | 42417.0 | 41633.0 | 43640.0 |
| 2021-03-30 04:00:00+00:00 | 51913.0 | 56939.0 | 58329.0 | 57671.0 | 55421.0 | 54578.0 | 54866.0 | 53069.0 | 50920.0 | 49051.0 | 47607.0 | 46991.0 | 48358.0 | 50709.0 | 51211.0 | 49234.0 | 49122.0 | 49962.0 | 47394.0 | 45452.0 | 44510.0 | 42417.0 | 41633.0 | 43640.0 | 48555.0 |
| 2025-05-30 19:00:00+00:00 | 41584.0 | 42931.0 | 43812.0 | 41966.0 | 38248.0 | 36750.0 | 34055.0 | 32579.0 | 32203.0 | 32126.0 | 32844.0 | 34701.0 | 37489.0 | 39643.0 | 40981.0 | 42892.0 | 42667.0 | 40909.0 | 40129.0 | 38743.0 | 38914.0 | 40175.0 | 40890.0 | 39980.0 | 39069.0 |
| 2025-05-30 20:00:00+00:00 | 42931.0 | 43812.0 | 41966.0 | 38248.0 | 36750.0 | 34055.0 | 32579.0 | 32203.0 | 32126.0 | 32844.0 | 34701.0 | 37489.0 | 39643.0 | 40981.0 | 42892.0 | 42667.0 | 40909.0 | 40129.0 | 38743.0 | 38914.0 | 40175.0 | 40890.0 | 39980.0 | 39069.0 | 40387.0 |
| 2025-05-30 21:00:00+00:00 | 43812.0 | 41966.0 | 38248.0 | 36750.0 | 34055.0 | 32579.0 | 32203.0 | 32126.0 | 32844.0 | 34701.0 | 37489.0 | 39643.0 | 40981.0 | 42892.0 | 42667.0 | 40909.0 | 40129.0 | 38743.0 | 38914.0 | 40175.0 | 40890.0 | 39980.0 | 39069.0 | 40387.0 | 41174.0 |
| 2025-05-30 22:00:00+00:00 | 41966.0 | 38248.0 | 36750.0 | 34055.0 | 32579.0 | 32203.0 | 32126.0 | 32844.0 | 34701.0 | 37489.0 | 39643.0 | 40981.0 | 42892.0 | 42667.0 | 40909.0 | 40129.0 | 38743.0 | 38914.0 | 40175.0 | 40890.0 | 39980.0 | 39069.0 | 40387.0 | 41174.0 | 39664.0 |
| 2025-05-30 23:00:00+00:00 | 38248.0 | 36750.0 | 34055.0 | 32579.0 | 32203.0 | 32126.0 | 32844.0 | 34701.0 | 37489.0 | 39643.0 | 40981.0 | 42892.0 | 42667.0 | 40909.0 | 40129.0 | 38743.0 | 38914.0 | 40175.0 | 40890.0 | 39980.0 | 39069.0 | 40387.0 | 41174.0 | 39664.0 | 36067.0 |
Summary statistics for each column (every load column is Float64 with no null values and a median ± IQR of 4.81e+04 ± 1.41e+04):
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | prediction_time | Datetime | 0 (0.0%) | 36552 (100.0%) | | | 2021-03-30T00:00:00+00:00 | | 2025-05-30T23:00:00+00:00 |
| 1 | load_mw | Float64 | 0 (0.0%) | 23274 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 2 | load_mw_horizon_1h | Float64 | 0 (0.0%) | 23275 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 3 | load_mw_horizon_2h | Float64 | 0 (0.0%) | 23276 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 4 | load_mw_horizon_3h | Float64 | 0 (0.0%) | 23276 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 5 | load_mw_horizon_4h | Float64 | 0 (0.0%) | 23277 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 6 | load_mw_horizon_5h | Float64 | 0 (0.0%) | 23277 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 7 | load_mw_horizon_6h | Float64 | 0 (0.0%) | 23278 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 8 | load_mw_horizon_7h | Float64 | 0 (0.0%) | 23278 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 9 | load_mw_horizon_8h | Float64 | 0 (0.0%) | 23278 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 10 | load_mw_horizon_9h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 11 | load_mw_horizon_10h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 12 | load_mw_horizon_11h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 13 | load_mw_horizon_12h | Float64 | 0 (0.0%) | 23278 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 14 | load_mw_horizon_13h | Float64 | 0 (0.0%) | 23278 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 15 | load_mw_horizon_14h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 16 | load_mw_horizon_15h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 17 | load_mw_horizon_16h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 18 | load_mw_horizon_17h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 19 | load_mw_horizon_18h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 20 | load_mw_horizon_19h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 21 | load_mw_horizon_20h | Float64 | 0 (0.0%) | 23279 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 22 | load_mw_horizon_21h | Float64 | 0 (0.0%) | 23280 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 23 | load_mw_horizon_22h | Float64 | 0 (0.0%) | 23280 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 24 | load_mw_horizon_23h | Float64 | 0 (0.0%) | 23281 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
| 25 | load_mw_horizon_24h | Float64 | 0 (0.0%) | 23280 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
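To make the role of the negative shift concrete, here is a minimal plain-Python sketch on a toy series. The values are the first five load_mw readings from the table above. The helper name shift_negative is hypothetical and only mimics what a shift by -h does to a single column; it is not part of polars or skrub.

```python
def shift_negative(values, h):
    """Move each value h positions toward the past, like a shift by -h.

    Positions whose future value lies beyond the end of the series get None.
    """
    return values[h:] + [None] * min(h, len(values))


# First five hourly load readings, copied from the targets table.
load_mw = [46395.0, 44269.0, 43874.0, 46197.0, 51913.0]

# One target column per horizon, named with the same pattern as the notebook.
toy_targets = {
    f"load_mw_horizon_{h}h": shift_negative(load_mw, h) for h in (1, 2)
}

for name, column in toy_targets.items():
    print(name, column)
```

Each row of a horizon column therefore holds the load observed h hours after that row's prediction time, which is what the model is asked to predict.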
For now, let’s focus on the last horizon (24 hours) and train a model that predicts the electricity load 24 hours ahead.
horizon_of_interest = horizons[-1] # Focus on the 24-hour horizon
target_column_name = target_column_name_pattern.format(horizon=horizon_of_interest)
predicted_target_column_name = "predicted_" + target_column_name
target = targets[target_column_name].skb.mark_as_y()
target
| load_mw_horizon_24h |
|---|
| 44510.0 |
| 42417.0 |
| 41633.0 |
| 43640.0 |
| 48555.0 |
| 39069.0 |
| 40387.0 |
| 41174.0 |
| 39664.0 |
| 36067.0 |
Summary statistics (median ± IQR: 4.81e+04 ± 1.41e+04):
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | load_mw_horizon_24h | Float64 | 0 (0.0%) | 23280 (63.7%) | 4.98e+04 | 1.05e+04 | 2.87e+04 | 4.81e+04 | 8.66e+04 |
Let’s define our first single-output prediction pipeline. This pipeline
chains our previous feature engineering steps with a skrub.DropCols step that
drops the columns we do not want to use as features, followed by a
HistGradientBoostingRegressor model from scikit-learn.
The skrub.choose_from, skrub.choose_float, and skrub.choose_int
functions define hyperparameters that can be tuned via
cross-validated randomized search.
from sklearn.ensemble import HistGradientBoostingRegressor
import skrub.selectors as s
features_with_dropped_cols = features.skb.apply(
    skrub.DropCols(
        cols=skrub.choose_from(
            {
                "none": s.glob(""),  # No column has an empty name.
                "load": s.glob("load_*"),
                "rolling_load": s.glob("load_mw_rolling_*"),
                "weather": s.glob("weather_*"),
                "temperature": s.glob("weather_temperature_*"),
                "moisture": s.glob("weather_soil_moisture_*"),
                "cloud_cover": s.glob("weather_cloud_cover_*"),
                "calendar": s.glob("cal_*"),
                "holiday": s.glob("cal_is_holiday*"),
                "future_1h": s.glob("*_future_1h"),
                "future_24h": s.glob("*_future_24h"),
                "non_paris_weather": s.glob("weather_*") & ~s.glob("weather_*_paris_*"),
            },
            name="dropped_cols",
        )
    )
)
hgbr_predictions = features_with_dropped_cols.skb.apply(
    HistGradientBoostingRegressor(
        random_state=0,
        loss=skrub.choose_from(["squared_error", "poisson", "gamma"], name="loss"),
        learning_rate=skrub.choose_float(
            0.01, 1, default=0.1, log=True, name="learning_rate"
        ),
        max_leaf_nodes=skrub.choose_int(
            3, 300, default=30, log=True, name="max_leaf_nodes"
        ),
    ),
    y=target,
)
hgbr_predictions
| load_mw_horizon_24h |
|---|
| 45611.43852796582 |
| 43455.14855909194 |
| 43200.53880546226 |
| 45175.28771945802 |
| 50042.302586037855 |
| 39641.18199990779 |
| 40350.146350713134 |
| 41434.14838508511 |
| 39556.26556642263 |
| 36952.556503271815 |
Summary statistics (median ± IQR: 4.81e+04 ± 1.41e+04):
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | load_mw_horizon_24h | Float64 | 0 (0.0%) | 36549 (100.0%) | 4.98e+04 | 1.04e+04 | 3.00e+04 | 4.81e+04 | 8.44e+04 |
The hgbr_predictions expression captures the whole expression graph, which
includes the feature engineering steps, the target variable, and the model
training step.
In particular, the input data keys for the full pipeline can be inspected as follows:
hgbr_predictions.skb.get_data().keys()
dict_keys(['prediction_start_time', 'prediction_end_time', 'historical_data_start_time', 'historical_data_end_time', 'data_source_folder', 'city_names'])
Furthermore, the hyperparameters of the full pipeline can be retrieved as follows:
hgbr_pipeline = hgbr_predictions.skb.get_pipeline()
hgbr_pipeline.describe_params()
{'dropped_cols': 'none',
'learning_rate': 0.1,
'loss': 'squared_error',
'max_leaf_nodes': 30}
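The dropped_cols choice selects columns by name pattern. As a rough illustration of which columns a given choice would remove, the sketch below uses the standard-library fnmatch module on a handful of column names from the feature table. It assumes that s.glob follows Unix shell-style matching comparable to fnmatch; that is an assumption about skrub's behavior rather than something shown in this notebook, and the helper name matching is hypothetical.

```python
from fnmatch import fnmatchcase

# A few column names taken from the feature table shown earlier.
columns = [
    "weather_temperature_2m_brest_future_1h",
    "weather_temperature_2m_brest_future_24h",
    "weather_cloud_cover_brest_future_24h",
    "cal_is_holiday_future_1h",
    "cal_is_holiday_future_24h",
]


def matching(pattern, names):
    """Return the names that match a shell-style glob pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Columns that each choice would drop, under the fnmatch assumption.
dropped_by_choice = {
    "none": matching("", columns),
    "temperature": matching("weather_temperature_*", columns),
    "holiday": matching("cal_is_holiday*", columns),
    "future_24h": matching("*_future_24h", columns),
}

for choice, dropped in dropped_by_choice.items():
    print(f"{choice}: drops {len(dropped)} column(s)")
```

Under this assumption the empty pattern matches no column, which is consistent with the default choice being labelled 'none' in the parameter description above.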
When running this notebook locally, you can also interactively inspect all the steps of the DAG using the following (once uncommented):
# hgbr_predictions.skb.full_report()
Since we passed input values to all the upstream skrub variables, skrub
automatically evaluates the whole expression graph (training and predicting
on the same data) so that we can interactively check that everything will
work as expected.
Let’s use altair to visualize the predictions against the target values for the last week (24 × 7 hours) of the prediction time range used to train the model. This allows us to check that the model can (over)fit the data with the features at hand.
altair.Chart(
    pl.concat(
        [
            targets.skb.eval(),
            hgbr_predictions.rename(
                {target_column_name: predicted_target_column_name}
            ).skb.eval(),
        ],
        how="horizontal",
    ).tail(24 * 7)
).transform_fold(
    [target_column_name, predicted_target_column_name],
).mark_line(
    tooltip=True
).encode(
    x="prediction_time:T", y="value:Q", color="key:N"
).interactive()
Assessing the model performance via cross-validation#
Being able to fit the training data is not enough. We need to assess the ability of the training pipeline to learn a predictive model that can generalize to unseen data.
Furthermore, we want to assess the uncertainty of this estimate of the generalization performance via time-based cross-validation, also known as backtesting.
from sklearn.model_selection import TimeSeriesSplit
max_train_size = 2 * 52 * 24 * 7  # max ~2 years of training data
test_size = 24 * 7 * 24  # 24 weeks of test data
gap = 7 * 24  # 1 week gap between train and test sets
ts_cv_5 = TimeSeriesSplit(
    n_splits=5, max_train_size=max_train_size, test_size=test_size, gap=gap
)
for fold_idx, (train_idx, test_idx) in enumerate(
    ts_cv_5.split(prediction_time.skb.eval())
):
    print(f"CV iteration #{fold_idx}")
    train_datetimes = prediction_time.skb.eval()[train_idx]
    test_datetimes = prediction_time.skb.eval()[test_idx]
    print(
        f"Train: {train_datetimes.shape[0]} rows, "
        f"Test: {test_datetimes.shape[0]} rows"
    )
    print(f"Train time range: {train_datetimes[0, 0]} to " f"{train_datetimes[-1, 0]} ")
    print(f"Test time range: {test_datetimes[0, 0]} to " f"{test_datetimes[-1, 0]} ")
    print()
CV iteration #0
Train: 16224 rows, Test: 4032 rows
Train time range: 2021-03-30 00:00:00+00:00 to 2023-02-03 23:00:00+00:00
Test time range: 2023-02-11 00:00:00+00:00 to 2023-07-28 23:00:00+00:00
CV iteration #1
Train: 17472 rows, Test: 4032 rows
Train time range: 2021-07-24 00:00:00+00:00 to 2023-07-21 23:00:00+00:00
Test time range: 2023-07-29 00:00:00+00:00 to 2024-01-12 23:00:00+00:00
CV iteration #2
Train: 17472 rows, Test: 4032 rows
Train time range: 2022-01-08 00:00:00+00:00 to 2024-01-05 23:00:00+00:00
Test time range: 2024-01-13 00:00:00+00:00 to 2024-06-28 23:00:00+00:00
CV iteration #3
Train: 17472 rows, Test: 4032 rows
Train time range: 2022-06-25 00:00:00+00:00 to 2024-06-21 23:00:00+00:00
Test time range: 2024-06-29 00:00:00+00:00 to 2024-12-13 23:00:00+00:00
CV iteration #4
Train: 17472 rows, Test: 4032 rows
Train time range: 2022-12-10 00:00:00+00:00 to 2024-12-06 23:00:00+00:00
Test time range: 2024-12-14 00:00:00+00:00 to 2025-05-30 23:00:00+00:00
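The fold sizes printed above follow from simple index arithmetic. The sketch below is a plain-Python approximation of that layout, not scikit-learn's implementation: the function name time_series_folds is hypothetical, and the sample count of 36,552 hourly rows is taken from the prediction_time column reported earlier.

```python
def time_series_folds(n_samples, n_splits, test_size, gap, max_train_size):
    """Yield (train_start, train_stop, test_start, test_stop) per fold.

    Test windows are laid out back to back at the end of the series. Each
    training window stops `gap` rows before its test window and is capped
    at `max_train_size` rows.
    """
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train_stop = test_start - gap
        train_start = max(0, train_stop - max_train_size)
        yield train_start, train_stop, test_start, test_start + test_size


folds = list(
    time_series_folds(
        n_samples=36_552,  # hourly rows in prediction_time
        n_splits=5,
        test_size=24 * 7 * 24,  # 24 weeks
        gap=7 * 24,  # 1 week
        max_train_size=2 * 52 * 24 * 7,  # ~2 years
    )
)

for fold_idx, (train_start, train_stop, test_start, test_stop) in enumerate(folds):
    print(
        f"fold #{fold_idx}: train rows={train_stop - train_start}, "
        f"test rows={test_stop - test_start}"
    )
```

The first fold has fewer training rows than the others because fewer than max_train_size rows exist before its test window; later folds hit the cap.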
from sklearn.metrics import make_scorer, mean_absolute_percentage_error, get_scorer
from sklearn.metrics import d2_tweedie_score
cv_results = hgbr_predictions.skb.cross_validate(
    cv=ts_cv_5,
    scoring={
        "mape": make_scorer(mean_absolute_percentage_error),
        "r2": get_scorer("r2"),
        "d2_poisson": make_scorer(d2_tweedie_score, power=1.0),
        "d2_gamma": make_scorer(d2_tweedie_score, power=2.0),
    },
    return_train_score=True,
    return_pipeline=True,
    verbose=1,
    n_jobs=-1,
)
cv_results.round(3)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done 5 out of 5 | elapsed: 8.4s finished
| | fit_time | score_time | test_mape | train_mape | test_r2 | train_r2 | test_d2_poisson | train_d2_poisson | test_d2_gamma | train_d2_gamma | pipeline |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.936 | 0.060 | 0.027 | 0.012 | 0.963 | 0.994 | 0.962 | 0.994 | 0.961 | 0.994 | SkrubPipeline(expr=<Apply HistGradientBoosting... |
| 1 | 3.229 | 0.063 | 0.024 | 0.013 | 0.978 | 0.994 | 0.977 | 0.994 | 0.976 | 0.993 | SkrubPipeline(expr=<Apply HistGradientBoosting... |
| 2 | 3.242 | 0.060 | 0.023 | 0.014 | 0.974 | 0.993 | 0.974 | 0.993 | 0.975 | 0.992 | SkrubPipeline(expr=<Apply HistGradientBoosting... |
| 3 | 3.207 | 0.061 | 0.019 | 0.014 | 0.980 | 0.993 | 0.980 | 0.992 | 0.980 | 0.992 | SkrubPipeline(expr=<Apply HistGradientBoosting... |
| 4 | 2.154 | 0.037 | 0.023 | 0.014 | 0.977 | 0.993 | 0.978 | 0.992 | 0.978 | 0.992 | SkrubPipeline(expr=<Apply HistGradientBoosting... |
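One simple way to read this table is to summarize each metric by its mean and standard deviation across folds, which gives a rough idea of the uncertainty of the generalization estimate. The sketch below does this for MAPE with the standard library only. The numbers are copied from the table above and are therefore rounded to three decimals; the variable names are illustrative.

```python
from statistics import mean, stdev

# MAPE values copied from the cross-validation table (rounded to 3 decimals).
test_mape = [0.027, 0.024, 0.023, 0.019, 0.023]
train_mape = [0.012, 0.013, 0.014, 0.014, 0.014]

# Mean and sample standard deviation of the test MAPE across the 5 folds.
mape_mean = mean(test_mape)
mape_std = stdev(test_mape)

# Average gap between test and train error, a rough overfitting indicator.
train_test_gap = mean(test_mape) - mean(train_mape)

print(f"test MAPE: {mape_mean:.4f} +/- {mape_std:.4f}")
print(f"mean test - train gap: {train_test_gap:.4f}")
```

With only five folds, the standard deviation is itself a coarse estimate, so it is best read as an order of magnitude rather than a precise confidence interval.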
def collect_cv_predictions(pipelines, cv_splitter, predictions, prediction_time):
    index_generator = cv_splitter.split(prediction_time.skb.eval())
    def splitter(X, y, index_generator):
        """Workaround to transform a scikit-learn splitter into a function understood
        by `skrub.train_test_split`."""
        train_idx, test_idx = next(index_generator)
        return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
    results = []
    for (_, test_idx), pipeline in zip(
        cv_splitter.split(prediction_time.skb.eval()), pipelines
    ):
        split = predictions.skb.train_test_split(
            predictions.skb.get_data(),
            splitter=splitter,
            index_generator=index_generator,
        )
        results.append(
            pl.DataFrame(
                {
                    "prediction_time": prediction_time.skb.eval()[test_idx],
                    "load_mw": split["y_test"],
                    "predicted_load_mw": pipeline.predict(split["test"]),
                }
            )
        )
    return results
cv_predictions = collect_cv_predictions(
    cv_results["pipeline"], ts_cv_5, hgbr_predictions, prediction_time
)
cv_predictions[0]
| prediction_time | load_mw | predicted_load_mw |
|---|---|---|
| datetime[μs, UTC] | f64 | f64 |
| 2023-02-11 00:00:00 UTC | 59258.0 | 59855.334418 |
| 2023-02-11 01:00:00 UTC | 58654.0 | 59958.654564 |
| 2023-02-11 02:00:00 UTC | 56155.0 | 57666.184522 |
| 2023-02-11 03:00:00 UTC | 54463.0 | 55832.880673 |
| 2023-02-11 04:00:00 UTC | 54698.0 | 57121.984097 |
| … | … | … |
| 2023-07-28 19:00:00 UTC | 38781.0 | 40093.987086 |
| 2023-07-28 20:00:00 UTC | 38455.0 | 39343.771368 |
| 2023-07-28 21:00:00 UTC | 39972.0 | 40738.151594 |
| 2023-07-28 22:00:00 UTC | 39825.0 | 39449.468131 |
| 2023-07-28 23:00:00 UTC | 36822.0 | 35828.293662 |
def lorenz_curve(observed_value, predicted_value, n_samples=1_000):
    """Compute the Lorenz curve for the given observed and predicted values."""
    def gini_index(cum_proportion_population, cum_proportion_y_true):
        from sklearn.metrics import auc
        return 1 - 2 * auc(cum_proportion_population, cum_proportion_y_true)
    observed_value = np.asarray(observed_value)
    predicted_value = np.asarray(predicted_value)
    # Order the observations by increasing predicted value.
    sort_idx = np.argsort(predicted_value)
    observed_value_sorted = observed_value[sort_idx]
    original_n_samples = observed_value_sorted.shape[0]
    cum_proportion_population = np.cumsum(np.ones(original_n_samples))
    cum_proportion_population /= cum_proportion_population[-1]
    cum_proportion_y_true = np.cumsum(observed_value_sorted)
    cum_proportion_y_true /= cum_proportion_y_true[-1]
    gini_model = gini_index(cum_proportion_population, cum_proportion_y_true)
    # Resample the curve on a regular grid to limit the number of plotted points.
    cum_proportion_population_interpolated = np.linspace(0, 1, n_samples)
    cum_proportion_y_true_interpolated = np.interp(
        cum_proportion_population_interpolated,
        cum_proportion_population,
        cum_proportion_y_true,
    )
    return pl.DataFrame(
        {
            "cum_population": cum_proportion_population_interpolated,
            "cum_observed": cum_proportion_y_true_interpolated,
        }
    ).with_columns(
        pl.lit(gini_model).alias("gini_index"),
    )
def plot_lorenz_curve(cv_predictions, n_samples=1_000):
    """Plot the Lorenz curves of the model and of an oracle for each CV fold."""
    results = []
    for fold_idx, predictions in enumerate(cv_predictions):
        results.append(
            lorenz_curve(
                observed_value=predictions["load_mw"],
                predicted_value=predictions["predicted_load_mw"],
                n_samples=n_samples,
            ).with_columns(
                pl.lit(fold_idx).alias("fold_idx"),
                pl.lit("Model").alias("model"),
            )
        )
        results.append(
            lorenz_curve(
                observed_value=predictions["load_mw"],
                predicted_value=predictions["load_mw"],
                n_samples=n_samples,
            ).with_columns(
                pl.lit(fold_idx).alias("fold_idx"),
                pl.lit("Oracle").alias("model"),
            )
        )
    results = pl.concat(results)
    gini_stats = results.group_by("model").agg(
        [
            pl.col("gini_index")
            .mean()
            .map_elements(lambda x: f"{x:.4f}", return_dtype=pl.String)
            .alias("gini_mean"),
            pl.col("gini_index")
            .std()
            .map_elements(lambda x: f"{x:.4f}", return_dtype=pl.String)
            .alias("gini_std_dev"),
        ]
    )
    results = results.join(gini_stats, on="model").with_columns(
        pl.format("{} (Gini: {} +/- {})", "model", "gini_mean", "gini_std_dev").alias(
            "model_label"
        )
    )
    model_chart = (
        altair.Chart(results)
        .mark_line(strokeDash=[4, 2, 4, 2], opacity=0.8, tooltip=True)
        .encode(
            x=altair.X(
                "cum_population:Q",
                title="Fraction of observations sorted by predicted label",
            ),
            y=altair.Y("cum_observed:Q", title="Cumulative observed load proportion"),
            color=altair.Color(
                "model_label:N", legend=altair.Legend(title="Models"), sort=None
            ),
            detail="fold_idx:N",
        )
    )
    diagonal_chart = (
        altair.Chart(
            pl.DataFrame(
                {
                    "cum_population": [0, 1],
                    "cum_observed": [0, 1],
                    "model_label": "Non-informative model (Gini = 0.0)",
                }
            )
        )
        .mark_line(strokeDash=[4, 4], opacity=0.5, tooltip=True)
        .encode(
            x=altair.X(
                "cum_population:Q",
                title="Fraction of observations sorted by predicted label",
            ),
            y=altair.Y("cum_observed:Q", title="Cumulative observed load proportion"),
            color=altair.Color(
                "model_label:N", legend=altair.Legend(title="Models"), sort=None
            ),
        )
    )
    return model_chart + diagonal_chart
plot_lorenz_curve(cv_predictions, n_samples=500).interactive()
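To see what the Gini index in the legend measures, here is a minimal standard-library sketch of the same computation on four toy observations. It follows the notebook's convention (observations sorted by increasing prediction, Gini = 1 - 2 × area under the curve with the trapezoidal rule, curve starting at the first sample rather than at the origin). The helper names lorenz_points and gini_index and the toy values are illustrative assumptions.

```python
def lorenz_points(observed, predicted):
    """Cumulative population and observed shares, ordered by prediction."""
    order = sorted(range(len(observed)), key=lambda i: predicted[i])
    total = sum(observed)
    n = len(observed)
    cum_population, cum_observed, running = [], [], 0.0
    for rank, i in enumerate(order, start=1):
        running += observed[i]
        cum_population.append(rank / n)
        cum_observed.append(running / total)
    return cum_population, cum_observed


def gini_index(cum_population, cum_observed):
    """1 - 2 * area under the curve, using the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(cum_population)):
        width = cum_population[k] - cum_population[k - 1]
        area += width * (cum_observed[k] + cum_observed[k - 1]) / 2
    return 1 - 2 * area


observed = [10.0, 20.0, 30.0, 40.0]

# Oracle: ranking by the observed values themselves.
oracle_gini = gini_index(*lorenz_points(observed, observed))

# A model that ranks the observations in exactly the wrong order.
reversed_gini = gini_index(*lorenz_points(observed, [4.0, 3.0, 2.0, 1.0]))

print(f"oracle: {oracle_gini:.3f}, reversed ranking: {reversed_gini:.3f}")
```

The oracle gives the largest value attainable for these observations, so a model whose Gini index is close to the oracle's ranks the observations almost as well as the true loads would.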
def plot_reliability_diagram(
cv_predictions, kind="mean", quantile_level=0.5, n_bins=10
):
# min and max load over all predictions and observations for any folds:
all_loads = pl.concat(
[
cv_prediction.select(["load_mw", "predicted_load_mw"])
for cv_prediction in cv_predictions
]
)
all_loads = pl.concat(all_loads["load_mw", "predicted_load_mw"])
min_load, max_load = all_loads.min(), all_loads.max()
scale = altair.Scale(domain=[min_load, max_load])
# Create the perfect line
chart = (
altair.Chart(
pl.DataFrame(
{
"mean_predicted_load_mw": [min_load, max_load],
"mean_load_mw": [min_load, max_load],
"label": ["Perfect"] * 2,
}
)
)
.mark_line(tooltip=True, opacity=0.8, strokeDash=[5, 5])
.encode(
x=altair.X("mean_predicted_load_mw:Q", scale=scale),
y=altair.Y("mean_load_mw:Q", scale=scale),
color=altair.Color(
"label:N",
scale=altair.Scale(range=["black"]),
legend=altair.Legend(title="Legend"),
),
)
)
# Add lines for each CV fold with date labels
for fold_idx, cv_predictions_i in enumerate(cv_predictions):
# Get date range for this CV fold
min_date = cv_predictions_i["prediction_time"].min().strftime("%Y-%m-%d")
max_date = cv_predictions_i["prediction_time"].max().strftime("%Y-%m-%d")
fold_label = f"#{fold_idx} - {min_date} to {max_date}"
if kind == "mean":
y_name = "mean_load_mw"
agg_expr = pl.col("load_mw")
elif kind == "quantile":
y_name = "quantile_of_load_mw"
agg_expr = (
pl.col("load_mw").quantile(quantile_level)
)
else:
raise ValueError(f"Unknown kind: {kind}. Use 'mean' or 'quantile'.")
mean_per_bins = (
cv_predictions_i.group_by(
pl.col("predicted_load_mw").qcut(np.linspace(0, 1, n_bins))
)
.agg(
[
agg_expr.alias(y_name),
pl.col("predicted_load_mw").mean().alias("mean_predicted_load_mw"),
]
)
.sort("predicted_load_mw")
.with_columns(pl.lit(fold_label).alias("fold_label"))
)
chart += (
altair.Chart(mean_per_bins)
.mark_line(tooltip=True, point=True, opacity=0.8)
.encode(
x=altair.X("mean_predicted_load_mw:Q", scale=scale),
y=altair.Y(f"{y_name}:Q", scale=scale),
color=altair.Color(
"fold_label:N",
legend=altair.Legend(title=None),
),
detail=altair.Detail("fold_label:N"),
)
)
return chart.resolve_scale(color="independent")
plot_reliability_diagram(cv_predictions).interactive().properties(
title="Reliability diagram from cross-validation predictions"
)
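To make the binning behind this diagram concrete, here is a minimal pure-Python sketch: predictions are sorted, split into equally sized groups, and the mean prediction of each group is compared with the mean observation of the same group. The helper `reliability_bins` and the toy data are hypothetical; `plot_reliability_diagram` itself relies on polars `qcut`, whose bin edges may differ slightly.

```python
from statistics import mean


def reliability_bins(y_true, y_pred, n_bins=2):
    """Return one (mean_predicted, mean_observed) pair per bin of sorted predictions."""
    pairs = sorted(zip(y_pred, y_true))
    n = len(pairs)
    bins = []
    for b in range(n_bins):
        # Equal-count bins over the sorted predictions.
        chunk = pairs[b * n // n_bins : (b + 1) * n // n_bins]
        if chunk:
            bins.append(
                (mean([p for p, _ in chunk]), mean([t for _, t in chunk]))
            )
    return bins


# Toy example: in each bin the mean prediction equals the mean observation,
# so the points fall exactly on the "Perfect" diagonal.
toy_bins = reliability_bins(
    y_true=[10.0, 12.0, 20.0, 22.0], y_pred=[11.0, 11.0, 21.0, 21.0]
)
```

Points above the diagonal indicate bins where the model under-predicts on average, and points below indicate over-prediction.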
def plot_residuals_vs_predicted(cv_predictions):
"""Plot residuals vs predicted values scatter plot for all CV folds."""
all_scatter_plots = []
x_title = "Predicted Load (MW)"
y_title = "Residual load (MW): predicted - actual"
for i, cv_prediction in enumerate(cv_predictions):
# Get date range for this CV fold
min_date = cv_prediction["prediction_time"].min().strftime("%Y-%m-%d")
max_date = cv_prediction["prediction_time"].max().strftime("%Y-%m-%d")
fold_label = f"#{i+1} - {min_date} to {max_date}"
# Calculate residuals
residuals_data = cv_prediction.with_columns(
[(pl.col("predicted_load_mw") - pl.col("load_mw")).alias("residual")]
).with_columns([pl.lit(fold_label).alias("fold_label")])
# Create scatter plot for this CV fold
scatter_plot = (
altair.Chart(residuals_data)
.mark_circle(opacity=0.6, size=20)
.encode(
x=altair.X(
"predicted_load_mw:Q",
title=x_title,
scale=altair.Scale(zero=False),
),
y=altair.Y("residual:Q", title=y_title),
color=altair.Color("fold_label:N", legend=None),
tooltip=[
"prediction_time:T",
"load_mw:Q",
"predicted_load_mw:Q",
"residual:Q",
"fold_label:N",
],
)
)
all_scatter_plots.append(scatter_plot)
# Get the range of predicted values for the perfect line
all_predictions = pl.concat(
[cv_pred["predicted_load_mw"] for cv_pred in cv_predictions]
)
min_pred, max_pred = all_predictions.min(), all_predictions.max()
# Create perfect residuals line at y=0
perfect_line = (
altair.Chart(
pl.DataFrame(
{
"predicted_load_mw": [min_pred, max_pred],
"perfect_residual": [0, 0],
"label": ["Perfect"] * 2,
}
)
)
.mark_line(strokeDash=[5, 5], opacity=0.8, color="black")
.encode(
x=altair.X("predicted_load_mw:Q", title=x_title),
y=altair.Y("perfect_residual:Q", title=y_title),
color=altair.Color(
"label:N",
scale=altair.Scale(range=["black"]),
legend=None,
),
)
)
# Combine all scatter plots
combined_scatter = all_scatter_plots[0]
for plot in all_scatter_plots[1:]:
combined_scatter += plot
# Layer the scatter plots with the perfect line
return (combined_scatter + perfect_line).resolve_scale(color="independent")
plot_residuals_vs_predicted(cv_predictions).interactive().properties(
title="Residuals vs Predicted Values from cross-validation predictions"
)
def plot_binned_residuals(cv_predictions, by="hour"):
"""Plot the average residuals binned by time period, one line per CV fold."""
# Configure binning based on the 'by' parameter
if by == "hour":
time_column = "hour_of_day"
time_extractor = pl.col("prediction_time").dt.hour().alias(time_column)
x_title = "Hour of day"
elif by == "month":
time_column = "month_of_year"
time_extractor = pl.col("prediction_time").dt.month().alias(time_column)
x_title = "Month of year"
else:
raise ValueError(f"Unsupported binning method: {by}. Use 'hour' or 'month'.")
all_iqr_bands = []
all_mean_lines = []
time_range = None # Will store the min/max time values for the perfect line
for i, cv_prediction in enumerate(cv_predictions):
# Get date range for this CV fold
min_date = cv_prediction["prediction_time"].min().strftime("%Y-%m-%d")
max_date = cv_prediction["prediction_time"].max().strftime("%Y-%m-%d")
fold_label = f"#{i+1} - {min_date} to {max_date}"
# Create residuals and time binning columns
residuals_detailed = cv_prediction.with_columns(
[
(pl.col("predicted_load_mw") - pl.col("load_mw")).alias("residual"),
time_extractor,
]
)
# Calculate statistics for this CV fold
residuals_stats = (
residuals_detailed.group_by(time_column)
.agg(
[
pl.col("residual").mean().round(1).alias("mean_residual"),
pl.col("residual").quantile(0.25).round(1).alias("q25_residual"),
pl.col("residual").quantile(0.75).round(1).alias("q75_residual"),
]
)
.sort(time_column)
.with_columns(pl.lit(fold_label).alias("fold_label"))
)
# Track the overall min/max time values across all CV folds for the perfect line
if time_range is None:
time_range = (
residuals_stats[time_column].min(),
residuals_stats[time_column].max(),
)
else:
time_range = (
min(time_range[0], residuals_stats[time_column].min()),
max(time_range[1], residuals_stats[time_column].max()),
)
# Create IQR band for this CV fold
iqr_band = (
altair.Chart(residuals_stats)
.mark_area(opacity=0.15)
.encode(
x=altair.X(f"{time_column}:O", title=x_title),
y=altair.Y("q25_residual:Q"),
y2=altair.Y2("q75_residual:Q"),
)
)
# Create mean line for this CV fold
mean_line = (
altair.Chart(residuals_stats)
.mark_line(tooltip=True, point=True, opacity=0.8)
.encode(
x=altair.X(f"{time_column}:O", title=x_title),
y=altair.Y("mean_residual:Q", title="Mean residual (MW)"),
color=altair.Color("fold_label:N", legend=None),
detail="fold_label:N",
)
)
all_iqr_bands.append(iqr_band)
all_mean_lines.append(mean_line)
# Create perfect residuals line at y=0
perfect_line = (
altair.Chart(
pl.DataFrame(
{
time_column: [time_range[0], time_range[1]],
"perfect_residual": [0, 0],
"label": ["Perfect"] * 2,
}
)
)
.mark_line(strokeDash=[5, 5], opacity=0.8, color="black")
.encode(
x=altair.X(f"{time_column}:O", title=x_title),
y=altair.Y("perfect_residual:Q", title="Mean residual (MW)"),
color=altair.Color(
"label:N",
scale=altair.Scale(range=["black"]),
legend=None,
),
)
)
# Combine all IQR bands
combined_iqr = all_iqr_bands[0]
for band in all_iqr_bands[1:]:
combined_iqr += band
# Combine all mean lines
combined_lines = all_mean_lines[0]
for line in all_mean_lines[1:]:
combined_lines += line
# Layer the IQR bands behind the mean lines, with perfect line on top
return (combined_iqr + combined_lines + perfect_line).resolve_scale(
color="independent"
)
plot_binned_residuals(cv_predictions, by="hour").interactive().properties(
title="Residuals by hour of the day from cross-validation predictions"
)
plot_binned_residuals(cv_predictions, by="month").interactive().properties(
title="Residuals by month of the year from cross-validation predictions"
)
ts_cv_2 = TimeSeriesSplit(
n_splits=2, test_size=test_size, max_train_size=max_train_size, gap=24
)
randomized_search_ridge = hgbr_predictions.skb.get_randomized_search(
cv=ts_cv_2,
scoring="r2",
n_iter=100,
fitted=True,
verbose=1,
n_jobs=-1,
)
Fitting 2 folds for each of 100 candidates, totalling 200 fits
randomized_search_ridge.results_.round(3)
randomized_search_ridge.plot_results().update_layout(margin=dict(l=150))
# nested_cv_results = skrub.cross_validate(
# environment=predictions.skb.get_data(),
# pipeline=randomized_search,
# cv=ts_cv_5,
# scoring={
# "r2": get_scorer("r2"),
# "mape": make_scorer(mean_absolute_percentage_error),
# },
# n_jobs=-1,
# return_pipeline=True,
# ).round(3)
# nested_cv_results
# for outer_fold_idx in range(len(nested_cv_results)):
# print(
# nested_cv_results.loc[outer_fold_idx, "pipeline"]
# .results_.loc[0]
# .round(3)
# .to_dict()
# )
# TODO: Exercise applying a linear model with some additional feature engineering
from sklearn.linear_model import Ridge
from sklearn.kernel_approximation import Nystroem
model = skrub.tabular_learner(
estimator=Ridge(
alpha=skrub.choose_float(1e-6, 1e6, log=True, name="alpha", default=1e-3)
)
)
model.steps.insert(
-1,
(
"nystroem",
Nystroem(
n_components=skrub.choose_int(
10, 200, log=True, name="n_components", default=150
)
),
),
)
predictions_ridge = features_with_dropped_cols.skb.apply(model, y=target)
predictions_ridge
altair.Chart(
pl.concat(
[
targets.skb.eval(),
predictions_ridge.rename(
{target_column_name: predicted_target_column_name}
).skb.eval(),
],
how="horizontal",
).tail(24 * 7)
).transform_fold(
[target_column_name, predicted_target_column_name],
).mark_line(
tooltip=True
).encode(
x="prediction_time:T", y="value:Q", color="key:N"
).interactive()
randomized_search_ridge = predictions_ridge.skb.get_randomized_search(
cv=ts_cv_2,
scoring="r2",
n_iter=100,
fitted=True,
verbose=1,
n_jobs=-1,
)
randomized_search_ridge.plot_results().update_layout(margin=dict(l=200))
We observe that the default hyper-parameter values fall within the best-scoring region explored by the randomized search. This is a good sign that the model is well-specified and that its performance is not too sensitive to small changes in those values.
We could further assess the stability of these optimal hyper-parameters with a nested cross-validation, running one randomized search per fold of the outer cross-validation loop as in the commented-out code below. This is computationally expensive, however.
# nested_cv_results_ridge = skrub.cross_validate(
# environment=predictions_ridge.skb.get_data(),
# pipeline=randomized_search_ridge,
# cv=ts_cv_5,
# scoring={
# "r2": get_scorer("r2"),
# "mape": make_scorer(mean_absolute_percentage_error),
# },
# n_jobs=-1,
# return_pipeline=True,
# ).round(3)
# nested_cv_results_ridge.round(3)
cv_results_ridge = predictions_ridge.skb.cross_validate(
cv=ts_cv_5,
scoring={
"r2": get_scorer("r2"),
"mape": make_scorer(mean_absolute_percentage_error),
},
return_train_score=True,
return_pipeline=True,
verbose=1,
n_jobs=-1,
)
cv_predictions_ridge = collect_cv_predictions(
cv_results_ridge["pipeline"], ts_cv_5, predictions_ridge, prediction_time
)
plot_lorenz_curve(cv_predictions_ridge, n_samples=500).interactive()
plot_reliability_diagram(cv_predictions_ridge).interactive().properties(
title="Reliability diagram from cross-validation predictions"
)
from sklearn.multioutput import MultiOutputRegressor
multioutput_predictions = features_with_dropped_cols.skb.apply(
MultiOutputRegressor(
estimator=HistGradientBoostingRegressor(random_state=0), n_jobs=-1
),
y=targets.skb.drop(cols=["prediction_time", "load_mw"]).skb.mark_as_y(),
).skb.set_name("multioutput_gbdt")
target_column_names = [target_column_name_pattern.format(horizon=h) for h in horizons]
predicted_target_column_names = [
f"predicted_{target_column_name}" for target_column_name in target_column_names
]
named_predictions = multioutput_predictions.rename(
{k: v for k, v in zip(target_column_names, predicted_target_column_names)}
)
import datetime
def plot_horizon_forecast(
targets, named_predictions, plot_at_time, historical_timedelta
):
"""Plot the true target and the forecast values."""
merged_data = targets.skb.select(cols=["prediction_time", "load_mw"]).skb.concat(
[named_predictions], axis=1
)
start_time = plot_at_time - historical_timedelta
end_time = plot_at_time + datetime.timedelta(
hours=named_predictions.skb.eval().shape[1]
)
true_values_past = merged_data.filter(
pl.col("prediction_time").is_between(start_time, plot_at_time, closed="both")
).rename({"load_mw": "Past true load"})
true_values_future = merged_data.filter(
pl.col("prediction_time").is_between(plot_at_time, end_time, closed="both")
).rename({"load_mw": "Future true load"})
predicted_record = (
merged_data.skb.select(
cols=skrub.selectors.filter_names(str.startswith, "predict")
)
.row(by_predicate=pl.col("prediction_time") == plot_at_time, named=True)
.skb.eval()
)
forecast_values = pl.DataFrame(
{
"prediction_time": predicted_record["prediction_time"]
+ datetime.timedelta(hours=horizon),
"Forecast load": predicted_record[
"predicted_" + target_column_name_pattern.format(horizon=horizon)
],
}
for horizon in range(1, len(predicted_record))
)
true_values_past_chart = (
altair.Chart(true_values_past.skb.eval())
.transform_fold(["Past true load"])
.mark_line(tooltip=True)
.encode(x="prediction_time:T", y="Past true load:Q", color="key:N")
)
true_values_future_chart = (
altair.Chart(true_values_future.skb.eval())
.transform_fold(["Future true load"])
.mark_line(tooltip=True)
.encode(x="prediction_time:T", y="Future true load:Q", color="key:N")
)
forecast_values_chart = (
altair.Chart(forecast_values)
.transform_fold(["Forecast load"])
.mark_line(tooltip=True)
.encode(x="prediction_time:T", y="Forecast load:Q", color="key:N")
)
return (
true_values_past_chart + true_values_future_chart + forecast_values_chart
).interactive()
plot_at_time = datetime.datetime(2025, 5, 24, 0, 0, tzinfo=datetime.timezone.utc)
historical_timedelta = datetime.timedelta(hours=24 * 5)
plot_horizon_forecast(targets, named_predictions, plot_at_time, historical_timedelta)
plot_at_time = datetime.datetime(2025, 5, 25, 0, 0, tzinfo=datetime.timezone.utc)
plot_horizon_forecast(targets, named_predictions, plot_at_time, historical_timedelta)
from sklearn.metrics import r2_score
def multioutput_scorer(regressor, X, y, score_func, score_name):
y_pred = regressor.predict(X)
return {
f"{score_name}_horizon_{h}h": score
for h, score in enumerate(
score_func(y, y_pred, multioutput="raw_values"), start=1
)
}
def scoring(regressor, X, y):
return {
**multioutput_scorer(regressor, X, y, mean_absolute_percentage_error, "mape"),
**multioutput_scorer(regressor, X, y, r2_score, "r2"),
}
multioutput_cv_results = multioutput_predictions.skb.cross_validate(
cv=ts_cv_5,
scoring=scoring,
return_train_score=True,
verbose=1,
n_jobs=-1,
).round(3)
multioutput_cv_results
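The `multioutput="raw_values"` option used in `multioutput_scorer` makes the metric return one score per target column instead of a single average. The dependency-free sketch below illustrates that idea for MAPE. `mape_per_horizon` is a hypothetical helper, the numbers are made up, and the sketch assumes non-zero true values.

```python
def mape_per_horizon(y_true_rows, y_pred_rows):
    """Compute one MAPE value per column (horizon) of row-major nested lists."""
    n_rows = len(y_true_rows)
    n_horizons = len(y_true_rows[0])
    scores = {}
    for h in range(n_horizons):
        # Relative absolute error of every row for this horizon only.
        errors = [
            abs(y_true_rows[r][h] - y_pred_rows[r][h]) / abs(y_true_rows[r][h])
            for r in range(n_rows)
        ]
        scores[f"mape_horizon_{h + 1}h"] = sum(errors) / n_rows
    return scores


toy_scores = mape_per_horizon(
    y_true_rows=[[100.0, 200.0], [100.0, 200.0]],
    y_pred_rows=[[110.0, 200.0], [90.0, 250.0]],
)
```

Keeping the horizons separate is what allows the per-horizon comparison of scores in the cross-validation results.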
import itertools
from IPython.display import display
for metric_name, dataset_type in itertools.product(["mape", "r2"], ["train", "test"]):
columns = multioutput_cv_results.columns[
multioutput_cv_results.columns.str.startswith(f"{dataset_type}_{metric_name}")
]
data_to_plot = multioutput_cv_results[columns]
data_to_plot.columns = [
col.replace(f"{dataset_type}_", "")
.replace(f"{metric_name}_", "")
.replace("_", " ")
for col in columns
]
data_long = data_to_plot.melt(var_name="horizon", value_name="score")
chart = (
altair.Chart(
data_long,
title=f"{dataset_type.title()} {metric_name.upper()} Scores by Horizon",
)
.mark_boxplot(extent="min-max")
.encode(
x=altair.X(
"horizon:N",
title="Horizon",
sort=altair.Sort(
[f"horizon {h}h" for h in range(1, data_to_plot.shape[1] + 1)]
),
),
y=altair.Y("score:Q", title=f"{metric_name.upper()} Score"),
color=altair.Color("horizon:N", legend=None),
)
)
display(chart)
# TODO: Exercise using RandomForestRegressor
from sklearn.ensemble import RandomForestRegressor
multioutput_predictions_rf = features_with_dropped_cols.skb.apply(
RandomForestRegressor(min_samples_leaf=30, random_state=0, n_jobs=-1),
y=targets.skb.drop(cols=["prediction_time", "load_mw"]).skb.mark_as_y(),
).skb.set_name("random_forest")
named_predictions_rf = multioutput_predictions_rf.rename(
{k: v for k, v in zip(target_column_names, predicted_target_column_names)}
)
plot_at_time = datetime.datetime(2025, 5, 24, 0, 0, tzinfo=datetime.timezone.utc)
historical_timedelta = datetime.timedelta(hours=24 * 5)
plot_horizon_forecast(targets, named_predictions_rf, plot_at_time, historical_timedelta)
plot_at_time = datetime.datetime(2025, 5, 25, 0, 0, tzinfo=datetime.timezone.utc)
plot_horizon_forecast(targets, named_predictions_rf, plot_at_time, historical_timedelta)
multioutput_cv_results_rf = multioutput_predictions_rf.skb.cross_validate(
cv=ts_cv_5,
scoring=scoring,
return_train_score=True,
verbose=1,
n_jobs=-1,
)
multioutput_cv_results_rf.round(3)
import itertools
from IPython.display import display
for metric_name, dataset_type in itertools.product(["mape", "r2"], ["train", "test"]):
columns = multioutput_cv_results_rf.columns[
multioutput_cv_results_rf.columns.str.startswith(f"{dataset_type}_{metric_name}")
]
data_to_plot = multioutput_cv_results_rf[columns]
data_to_plot.columns = [
col.replace(f"{dataset_type}_", "")
.replace(f"{metric_name}_", "")
.replace("_", " ")
for col in columns
]
data_long = data_to_plot.melt(var_name="horizon", value_name="score")
chart = (
altair.Chart(
data_long,
title=f"{dataset_type.title()} {metric_name.upper()} Scores by Horizon",
)
.mark_boxplot(extent="min-max")
.encode(
x=altair.X(
"horizon:N",
title="Horizon",
sort=altair.Sort(
[f"horizon {h}h" for h in range(1, data_to_plot.shape[1] + 1)]
),
),
y=altair.Y("score:Q", title=f"{metric_name.upper()} Score"),
color=altair.Color("horizon:N", legend=None),
)
)
display(chart)
from sklearn.metrics import mean_pinball_loss
scoring = {
"r2": get_scorer("r2"),
"mape": make_scorer(mean_absolute_percentage_error),
"mean_pinball_05_loss": make_scorer(mean_pinball_loss, alpha=0.05),
"mean_pinball_50_loss": make_scorer(mean_pinball_loss, alpha=0.5),
"mean_pinball_95_loss": make_scorer(mean_pinball_loss, alpha=0.95),
}
common_params = dict(
loss="quantile", learning_rate=0.1, max_leaf_nodes=100, random_state=0
)
predictions_gbrt_05 = features_with_dropped_cols.skb.apply(
HistGradientBoostingRegressor(**common_params, quantile=0.05),
y=target,
)
predictions_gbrt_50 = features_with_dropped_cols.skb.apply(
HistGradientBoostingRegressor(**common_params, quantile=0.5),
y=target,
)
predictions_gbrt_95 = features_with_dropped_cols.skb.apply(
HistGradientBoostingRegressor(**common_params, quantile=0.95),
y=target,
)
cv_results_05 = predictions_gbrt_05.skb.cross_validate(
cv=ts_cv_5,
scoring=scoring,
return_pipeline=True,
verbose=1,
n_jobs=-1,
)
cv_results_50 = predictions_gbrt_50.skb.cross_validate(
cv=ts_cv_5,
scoring=scoring,
return_pipeline=True,
verbose=1,
n_jobs=-1,
)
cv_results_95 = predictions_gbrt_95.skb.cross_validate(
cv=ts_cv_5,
scoring=scoring,
return_pipeline=True,
verbose=1,
n_jobs=-1,
)
cv_results_05[[col for col in cv_results_05.columns if col.startswith("test_")]].mean(
axis=0
).round(3)
cv_results_50[[col for col in cv_results_50.columns if col.startswith("test_")]].mean(
axis=0
).round(3)
cv_results_95[[col for col in cv_results_95.columns if col.startswith("test_")]].mean(
axis=0
).round(3)
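The `mean_pinball_*_loss` entries reported above are easier to read with the definition of the pinball loss in mind: at quantile level `alpha`, under-predictions are penalized with weight `alpha` and over-predictions with weight `1 - alpha`. The following pure-Python sketch is a hypothetical re-implementation for illustration only (the notebook uses scikit-learn's `mean_pinball_loss`), and the toy values are made up.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Average pinball loss of quantile predictions at level alpha."""
    total = 0.0
    for observed, predicted in zip(y_true, y_pred):
        diff = observed - predicted
        # Under-predictions (diff >= 0) are weighted by alpha,
        # over-predictions (diff < 0) by 1 - alpha.
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)


toy_true = [10.0, 10.0]
toy_pred = [8.0, 12.0]
loss_at_median = pinball_loss(toy_true, toy_pred, alpha=0.5)
loss_at_95 = pinball_loss(toy_true, toy_pred, alpha=0.95)
```

At `alpha=0.5` the loss is half the mean absolute error, which is why the 0.5 quantile model is a median regressor. Each model should obtain its best pinball loss at the quantile level it was trained for.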
results = pl.concat(
[
targets.skb.select(cols=["prediction_time", target_column_name]).skb.eval(),
predictions_gbrt_05.rename({target_column_name: "quantile_05"}).skb.eval(),
predictions_gbrt_50.rename({target_column_name: "median"}).skb.eval(),
predictions_gbrt_95.rename({target_column_name: "quantile_95"}).skb.eval(),
],
how="horizontal",
).tail(24 * 7)
median_chart = (
altair.Chart(results)
.transform_fold([target_column_name, "median"])
.mark_line(tooltip=True)
.encode(x="prediction_time:T", y="value:Q", color="key:N")
)
quantile_band_chart = (
altair.Chart(results)
.mark_area(opacity=0.4, tooltip=True)
.encode(
x="prediction_time:T",
y="quantile_05:Q",
y2="quantile_95:Q",
color=altair.value("lightgreen"),
)
)
combined_chart = quantile_band_chart + median_chart
combined_chart.interactive()
cv_predictions_05 = collect_cv_predictions(
cv_results_05["pipeline"], ts_cv_5, predictions_gbrt_05, prediction_time
)
cv_predictions_50 = collect_cv_predictions(
cv_results_50["pipeline"], ts_cv_5, predictions_gbrt_50, prediction_time
)
cv_predictions_95 = collect_cv_predictions(
cv_results_95["pipeline"], ts_cv_5, predictions_gbrt_95, prediction_time
)
plot_residuals_vs_predicted(cv_predictions_05).interactive().properties(
title=(
"Residuals vs Predicted Values from cross-validation predictions"
" for quantile 0.05"
)
)
plot_residuals_vs_predicted(cv_predictions_50).interactive().properties(
title=(
"Residuals vs Predicted Values from cross-validation predictions for median"
)
)
plot_residuals_vs_predicted(cv_predictions_95).interactive().properties(
title=(
"Residuals vs Predicted Values from cross-validation predictions"
" for quantile 0.95"
)
)
cv_predictions_05_concat = pl.concat(cv_predictions_05, how="vertical")
cv_predictions_50_concat = pl.concat(cv_predictions_50, how="vertical")
cv_predictions_95_concat = pl.concat(cv_predictions_95, how="vertical")
import matplotlib.pyplot as plt
from sklearn.metrics import PredictionErrorDisplay
for kind in ["actual_vs_predicted", "residual_vs_predicted"]:
fig, axs = plt.subplots(1, 3, figsize=(15, 5), sharey=True)
PredictionErrorDisplay.from_predictions(
y_true=cv_predictions_05_concat["load_mw"].to_numpy(),
y_pred=cv_predictions_05_concat["predicted_load_mw"].to_numpy(),
kind=kind,
ax=axs[0],
)
axs[0].set_title("0.05 quantile regression")
PredictionErrorDisplay.from_predictions(
y_true=cv_predictions_50_concat["load_mw"].to_numpy(),
y_pred=cv_predictions_50_concat["predicted_load_mw"].to_numpy(),
kind=kind,
ax=axs[1],
)
axs[1].set_title("Median regression")
PredictionErrorDisplay.from_predictions(
y_true=cv_predictions_95_concat["load_mw"].to_numpy(),
y_pred=cv_predictions_95_concat["predicted_load_mw"].to_numpy(),
kind=kind,
ax=axs[2],
)
axs[2].set_title("0.95 quantile regression")
fig.suptitle(f"{kind} for GBRT minimizing different quantile losses")
def coverage(y_true, y_quantile_low, y_quantile_high):
y_true = np.asarray(y_true)
y_quantile_low = np.asarray(y_quantile_low)
y_quantile_high = np.asarray(y_quantile_high)
return float(
np.logical_and(y_true >= y_quantile_low, y_true <= y_quantile_high)
.mean()
.round(4)
)
def mean_width(y_true, y_quantile_low, y_quantile_high):
y_true = np.asarray(y_true)
y_quantile_low = np.asarray(y_quantile_low)
y_quantile_high = np.asarray(y_quantile_high)
return float(np.abs(y_quantile_high - y_quantile_low).mean().round(1))
def binned_coverage(y_true_folds, y_quantile_low, y_quantile_high, n_bins=10):
"""Compute coverage after binning true values using quantile-based binning.
Parameters
----------
y_true_folds : list of numpy.ndarray
List of true target values, one array per CV fold
y_quantile_low : list of numpy.ndarray
List of lower quantile predictions, one array per CV fold
y_quantile_high : list of numpy.ndarray
List of upper quantile predictions, one array per CV fold
n_bins : int, default=10
Number of bins to create
Returns
-------
pandas.DataFrame
DataFrame with columns: bin_left, bin_right, bin_center, fold_idx,
coverage, mean_width, n_samples
"""
# Use all true values to define global bin boundaries
all_true_values = np.concatenate(y_true_folds)
df = pd.DataFrame({"bin_by": all_true_values})
df["bin"] = pd.qcut(df["bin_by"], q=n_bins, labels=False, duplicates="drop")
# Get bin boundaries for consistent binning across folds
bin_boundaries = []
for bin_idx in sorted(df["bin"].dropna().unique()):
bin_mask = df["bin"] == bin_idx
bin_values = df.loc[bin_mask, "bin_by"]
bin_boundaries.append((bin_values.min(), bin_values.max()))
results = []
n_folds = len(y_quantile_low)
for fold_idx in range(n_folds):
fold_true = y_true_folds[fold_idx]
fold_low = y_quantile_low[fold_idx]
fold_high = y_quantile_high[fold_idx]
# Assign each sample in this fold to a bin
fold_bins = (
np.digitize(fold_true, bins=[b[0] for b in bin_boundaries] + [np.inf]) - 1
)
for bin_idx, (bin_left, bin_right) in enumerate(bin_boundaries):
# Get samples from this fold that fall into this bin
bin_mask = fold_bins == bin_idx
if np.sum(bin_mask) == 0:
# No samples in this bin for this fold
continue
fold_bin_true = fold_true[bin_mask]
fold_bin_low = fold_low[bin_mask]
fold_bin_high = fold_high[bin_mask]
bin_center = (bin_left + bin_right) / 2
n_samples_in_bin = len(fold_bin_true)
coverage_score = coverage(fold_bin_true, fold_bin_low, fold_bin_high)
width = mean_width(fold_bin_true, fold_bin_low, fold_bin_high)
results.append(
{
"bin_left": bin_left,
"bin_right": bin_right,
"bin_center": bin_center,
"fold_idx": fold_idx,
"coverage": coverage_score,
"mean_width": width,
"n_samples": n_samples_in_bin,
}
)
return pd.DataFrame(results)
coverage(
cv_predictions_50_concat["load_mw"].to_numpy(),
cv_predictions_05_concat["predicted_load_mw"].to_numpy(),
cv_predictions_95_concat["predicted_load_mw"].to_numpy(),
)
mean_width(
cv_predictions_50_concat["load_mw"].to_numpy(),
cv_predictions_05_concat["predicted_load_mw"].to_numpy(),
cv_predictions_95_concat["predicted_load_mw"].to_numpy(),
)
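As a toy illustration of what these two numbers measure, the sketch below recomputes them in plain Python on four made-up observations. `toy_coverage` and `toy_mean_width` are hypothetical stand-ins for the `coverage` and `mean_width` functions, without the rounding.

```python
def toy_coverage(y_true, low, high):
    """Fraction of observations that fall inside their [low, high] interval."""
    inside = [lo <= y <= hi for y, lo, hi in zip(y_true, low, high)]
    return sum(inside) / len(inside)


def toy_mean_width(low, high):
    """Average width of the prediction intervals."""
    return sum(abs(hi - lo) for lo, hi in zip(low, high)) / len(low)


observed = [50.0, 55.0, 70.0, 40.0]
lower = [45.0, 50.0, 60.0, 42.0]
upper = [55.0, 60.0, 68.0, 50.0]

toy_cov = toy_coverage(observed, lower, upper)  # two of four points are inside
toy_width = toy_mean_width(lower, upper)
```

For an interval built from the 0.05 and 0.95 quantile predictions, a well calibrated pair of models would cover about 90% of the observations. Among intervals with similar coverage, narrower ones are more informative.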
# Compute binned coverage scores
binned_coverage_results = binned_coverage(
[df["load_mw"].to_numpy() for df in cv_predictions_50],
[df["predicted_load_mw"].to_numpy() for df in cv_predictions_05],
[df["predicted_load_mw"].to_numpy() for df in cv_predictions_95],
n_bins=10,
)
binned_coverage_results
coverage_by_bin = binned_coverage_results.copy()
coverage_by_bin["bin_label"] = coverage_by_bin.apply(
lambda row: f"[{row.bin_left:.0f}, {row.bin_right:.0f}]", axis=1
)
Reliability diagram for quantile regression#
plot_reliability_diagram(
cv_predictions_50, kind="quantile", quantile_level=0.50
).interactive().properties(
title="Reliability diagram for quantile 0.50 from cross-validation predictions"
)
plot_reliability_diagram(
cv_predictions_05, kind="quantile", quantile_level=0.05
).interactive().properties(
title="Reliability diagram for quantile 0.05 from cross-validation predictions"
)
plot_reliability_diagram(
cv_predictions_95, kind="quantile", quantile_level=0.95
).interactive().properties(
title="Reliability diagram for quantile 0.95 from cross-validation predictions"
)
ax = coverage_by_bin.boxplot(
column="coverage", by="bin_label", figsize=(12, 6), vert=False, whis=1000
)
ax.axvline(x=0.9, color="red", linestyle="--", label="Target coverage (0.9)")
ax.set_xlabel("Coverage")
ax.set_ylabel("Load bins (MW)")
ax.set_title("Coverage Distribution by Load Bins")
ax.legend()
plt.suptitle("") # Remove automatic suptitle from boxplot
plt.xticks(rotation=45)
plt.tight_layout()
Quantile regression as classification#
In the following, we turn the quantile regression problem, for all possible quantile levels at once, into a multiclass classification problem. We discretize the target variable into bins and train a classifier to predict the bin membership probabilities. The cumulative sum of these probabilities, interpolated between the bin edges, estimates the CDF of the continuous target variable conditioned on the features, and inverting this CDF yields any quantile.
Ideally, the classifier should remain efficient when trained on a large number of classes (one per bin). We therefore use a Random Forest classifier as the default base estimator.
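Before looking at the estimator, the CDF inversion step can be sketched for a single sample in plain Python. The function name `quantile_from_bins`, the bin edges and the probabilities are all hypothetical. The estimator defined in this section uses `scipy.interpolate.interp1d` for the same purpose and may behave differently on flat CDF segments.

```python
from bisect import bisect_left
from itertools import accumulate


def quantile_from_bins(edges, bin_probabilities, level):
    """Invert a piecewise-linear CDF defined by bin edges and bin probabilities."""
    # CDF value at each bin edge: 0 at the first edge, about 1 at the last.
    cdf = [0.0] + list(accumulate(bin_probabilities))
    # Index of the first CDF knot that reaches the requested level,
    # clamped so that cdf[k - 1] and cdf[k] always exist.
    k = min(max(bisect_left(cdf, level), 1), len(cdf) - 1)
    left, right = cdf[k - 1], cdf[k]
    if right == left:
        # Flat segment: avoid dividing by zero.
        return edges[k]
    fraction = (level - left) / (right - left)
    return edges[k - 1] + fraction * (edges[k] - edges[k - 1])


toy_edges = [0.0, 10.0, 20.0, 40.0]
toy_probabilities = [0.25, 0.5, 0.25]
toy_median = quantile_from_bins(toy_edges, toy_probabilities, 0.5)
```

Here the median falls in the middle of the second bin because half of that bin's probability mass is needed to reach a cumulative probability of 0.5.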
from scipy.interpolate import interp1d
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.utils.validation import check_is_fitted
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.utils.validation import check_consistent_length
from sklearn.utils import check_random_state
import numpy as np
class BinnedQuantileRegressor(BaseEstimator, RegressorMixin):
def __init__(
self,
estimator=None,
n_bins=100,
quantile=0.5,
random_state=None,
):
self.n_bins = n_bins
self.estimator = estimator
self.quantile = quantile
self.random_state = random_state
def fit(self, X, y):
# Lightweight input validation: most of the input validation will be
# handled by the sub estimators.
random_state = check_random_state(self.random_state)
check_consistent_length(X, y)
self.target_binner_ = KBinsDiscretizer(
n_bins=self.n_bins,
strategy="quantile",
subsample=200_000,
encode="ordinal",
random_state=random_state,
)
y_binned = (
self.target_binner_.fit_transform(np.asarray(y).reshape(-1, 1))
.ravel()
.astype(np.int32)
)
# Fit the multiclass classifier to predict the binned targets from the
# training set.
if self.estimator is None:
estimator = RandomForestClassifier(random_state=random_state)
else:
estimator = clone(self.estimator)
self.estimator_ = estimator.fit(X, y_binned)
return self
def predict_quantiles(self, X, quantiles=(0.05, 0.5, 0.95)):
check_is_fitted(self, "estimator_")
edges = self.target_binner_.bin_edges_[0]
n_bins = edges.shape[0] - 1
expected_shape = (X.shape[0], n_bins)
y_proba_raw = self.estimator_.predict_proba(X)
# Some bins might stay empty on the training set. Classifiers typically do
# not output an explicit 0 probability for unobserved classes, so we have
# to post-process their output:
if y_proba_raw.shape != expected_shape:
y_proba = np.zeros(shape=expected_shape)
y_proba[:, self.estimator_.classes_] = y_proba_raw
else:
y_proba = y_proba_raw
# Build the mapper for inverse CDF mapping, from cumulated
# probabilities to continuous prediction.
y_cdf = np.zeros(shape=(X.shape[0], edges.shape[0]))
y_cdf[:, 1:] = np.cumsum(y_proba, axis=1)
return np.asarray([interp1d(y_cdf_i, edges)(quantiles) for y_cdf_i in y_cdf])
def predict(self, X):
return self.predict_quantiles(X, self.quantile).ravel()